DNA ENCODING CANINE VON WILLEBRAND FACTOR
AND METHODS OF USE



FIELD OF THE INVENTION 
This invention relates generally to canine von Willebrand factor (vWF), and
more particularly, to the gene encoding vWF as well as a genetic defect that causes
canine von Willebrand's disease.

BIOLOGICAL DEPOSITS 

SEQUENCE ACCESSION NO 

Canine von Willebrand Factor 

BACKGROUND OF THE INVENTION

In both dogs and humans, von Willebrand's disease (vWD) is a bleeding
disorder of variable severity that results from a quantitative or qualitative defect in
von Willebrand factor (vWF) (Ginsburg, D. et al., Blood 79:2507-2519 (1992);
Ruggeri, Z.M. et al., FASEB J 7:308-316 (1993); Dodds, W.J., Mod Vet Pract 681-686
(1984); Johnson, G.S. et al., JAVMA 176:1261-1263 (1988); Brooks, M., Probl
In Vet Med 4:636-646 (1992)). This clotting factor has two known functions:
stabilization of Factor VIII (hemophilic factor A) in the blood, and aiding the adhesion
of platelets to the subendothelium, which allows them to provide hemostasis more
effectively. If the factor is missing or defective, the patient, whether human or dog,
may bleed severely.

The disease is the most common hereditary bleeding disorder in both
species, and is genetically and clinically heterogeneous. Three clinical types, called
1, 2, and 3 (formerly I, II, and III; see Sadler, J.E. et al., Blood 84:676-679 (1994) for
nomenclature changes), have been described. Type 1 vWD is inherited in a
dominant, incompletely penetrant fashion. Bleeding appears to be due to the
reduced level of vWF rather than a qualitative difference. Although this is the most
common form of vWD found in most mammals, and can cause serious bleeding
problems, it is generally less severe than the other two types. In addition, a
relatively inexpensive vasopressin analog (DDAVP) can help alleviate symptoms
(Kraus, K.H. et al., Vet Surg 18:103-109 (1989)).

In Type 2 vWD, the factor is qualitatively abnormal, as determined by
specialized tests (Ruggeri, Z.M. et al., FASEB J 7:308-316 (1993); Brooks, M.,
Probl In Vet Med 4:636-646 (1992)). This type is also
inherited in a dominant fashion and has only rarely been described in dogs
(Turrentine, M.A. et al., Vet Clin North Am Small Anim Pract 18:275 (1988)).

Type 3 vWD is the most severe form of the disease. It is inherited as an
autosomal recessive trait, and affected individuals have no detectable vWF in their
blood. Serious bleeding episodes require transfusions of blood or cryoprecipitate to
supply the missing vWF. Heterozygous carriers have moderately reduced factor
concentrations, but generally appear to have normal hemostasis.

Scottish terriers have Type 3 vWD (Dodds, W.J., Mod Vet Pract 681-686
(1984); Johnson, G.S. et al., JAVMA 176:1261-1263 (1988)). Homozygotes have
no detectable vWF and have a severe bleeding disorder. Heterozygotes have
reduced levels of the factor, and are clinically normal (Brooks, M. et al., JAVMA
200:1123-1127 (1992)). The prevalence of vWD among Scottish terriers, including
both heterozygotes and homozygotes, has been variously estimated at 27-31%
(Stokol, T. et al., Res. Vet. Sci. 59:152-155 (1995); Brooks, M., Proc. 9th ACVIM
Forum 89-91 (1991)).

Currently, detection of affected and carrier Scottish terrier dogs is done by
vWF antigen testing (Benson, R.E. et al., Am J Vet Res 44:399-403 (1983); Stokol,
T. et al., Res. Vet. Sci. 59:152-155 (1995)) or by coagulation assays (Rosborough,
T.K. et al., J. Lab. Clin. Med. 96:47-56 (1980); Read, M.S. et al., J. Lab. Clin. Med.
101:74-82 (1983)). These procedures yield variable results, as the protein-based
tests can be influenced by such things as sample collection, sample handling,
estrous, pregnancy, vaccination, age, and hypothyroidism (Strauss, H.S. et al., New
Eng J Med 269:1251-1252 (1963); Bloom, A.L., Mayo Clin Proc 66:743-751 (1991);
Stirling, Y. et al., Thromb Haemostasis 52:176-182 (1984); Mansel, P.D. et al., Br.
Vet J. 148:329-337 (1992); Avgeris, S. et al., JAVMA 196:921-924 (1990); Panciera,
D.P. et al., JAVMA 205:1550-1553 (1994)). Thus, for example, a dog that tests
within the normal range on one day can test within the carrier range on another day.
It is therefore difficult for breeders to use this information.

It would thus be desirable to provide the nucleic acid sequence encoding
canine vWF. It would also be desirable to provide the genetic defect responsible for
canine vWD. It would further be desirable to obtain the amino acid sequence of
canine vWF. It would also be desirable to provide a method for detecting carriers
of the defective vWF gene based on the nucleic acid sequence of the normal and
defective vWF gene.

SUMMARY OF THE INVENTION

The present invention provides a novel purified and isolated nucleic acid
sequence encoding canine vWF. A nucleic acid sequence containing the mutation
that causes vWD in Scottish terriers, a single-base deletion in exon 4, is also
provided. The nucleic acid sequences of the present invention may be used in
methods for detecting carriers of the mutation that causes vWD. Such methods may
be used by breeders to reduce the frequency of the disease-causing allele and the
incidence of disease. In addition, the nucleic acid sequence of the canine vWF
provided herein may be used to determine the genetic defect that causes vWD in
other breeds as well as other species.

Additional objects, advantages, and features of the present invention will 
become apparent from the following description, taken in conjunction with the 
accompanying drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS

The various advantages of the present invention will become apparent to one
skilled in the art by reading the following specification and by referencing the
following drawings in which:

Figures 1A-1C show the nucleic acid sequence of the canine von Willebrand
factor of the present invention;

Figures 2A-2C show a comparison of the human and canine prepro-von
Willebrand factor amino acid sequences;

Figure 3 provides nucleotide sequencing ladders for the von Willebrand's
disease mutation region for normal (clear), carrier, and affected Scottish terriers, the
sequences being obtained directly from PCR products derived from genomic DNAs
in exon 4;

Figure 4 illustrates the results of a method of the present invention used to
detect the Scottish terrier vWD mutation; and

Figure 5 shows the Scottish terrier pedigree, which in turn illustrates
segregation of the mutant and normal vWF alleles.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The cDNA encoding canine von Willebrand Factor (vWF) has been
sequenced and is set forth in Figures 1A-1C and SEQ ID NO: 1. The amino acid
sequence of canine vWF has been
consequently deduced and is set forth in Figures 2A-2C and SEQ ID NO: 2. The
mutation of the normal vWF gene which causes von Willebrand's Disease (vWD),
a deletion at codon 88 of the normal gene resulting in a frameshift, is also provided.
The nucleic acid sequences of the present invention may be used in methods for
detecting homozygous and heterozygous carriers of the defective vWF gene.

In a preferred method of detecting the presence of the von Willebrand allele
in canines, DNA samples are first collected by relatively noninvasive techniques, i.e.,
DNA samples are obtained with minimal penetration into body tissues of the animals
to be tested. Common noninvasive tissue sample collection methods may be used
and include withdrawing buccal cells via cheek swabs and withdrawing blood
samples. Following isolation of the DNA by standard techniques, PCR is performed
on the DNA utilizing pre-designed primers that produce restriction enzyme sites on
those DNA samples that harbor the defective gene. Treatment of the amplified DNA
with appropriate restriction enzymes such as BsiE I thus allows one to analyze for
the presence of the defective allele. One skilled in the art will appreciate that this
method may be applied not only to Scottish terriers, but to other breeds such as
Shetland sheepdogs and Dutch Kooikers.

Overall, the present invention provides breeders with an accurate, definitive
test whereby the undesired vWD gene may be eliminated from breeding lines. The
current tests used by breeders are protein-based, and as noted previously, the
primary difficulty with this type of test is the variability of results due to a variety of
factors. The ultimate result of such variability is that an inordinate number of
animals fall into an ambiguous grouping whereby carriers and noncarriers cannot be
reliably distinguished. The present invention obviates the inherent limitations of
protein-based tests by detecting the genetic mutation which causes vWD. As
described in Specific Example 1, the methods of the present invention provide an
accurate test for distinguishing noncarriers, homozygous carriers and heterozygous
carriers of the defective vWF gene.

It will be appreciated that because the vWF cDNA of the present invention
is substantially homologous to vWF cDNA throughout the canine species, the nucleic
acid sequences of the present invention may be used to detect DNA mutations in
other breeds as well. In addition, the canine vWF sequence presented herein,
potentially in combination with the established human sequence (Genbank
Accession No. X04385; Bonthron, D. et al., Nucleic Acids Res. 14:7125-7128 (1986);
Mancuso, D.J. et al., Biochemistry 30:253-269 (1991); Meyer, D. et al., Thromb
Haemostasis 70:99-104 (1993)), may be used to facilitate sequencing of the vWF
gene and genetic defects causing vWD in other mammalian species, e.g., by using
cross-species PCR methods known by those skilled in the art.

It is also within the contemplation of this invention that the isolated and
purified nucleic acid sequences of the present invention be incorporated into an
appropriate recombinant expression vector, e.g., viral or plasmid, which is capable
of transforming an appropriate host cell, either eukaryotic (e.g., mammalian) or
prokaryotic (e.g., E. coli). Such DNA may involve alternate nucleic acid forms, such
as cDNA, gDNA, and DNA prepared by partial or total chemical synthesis. The DNA
may also be accompanied by additional regulatory elements, such as promoters,
operators and regulators, which are necessary and/or may enhance the expression
of the vWF gene product. In this way, cells may be induced to over-express the
vWF gene, thereby generating desired amounts of the target vWF protein. It is
further contemplated that the canine vWF polypeptide sequence of the present
invention may be utilized to manufacture canine vWF using standard synthetic
methods. One skilled in the art will also note that the defective protein encoded by
the defective vWF gene of the present invention may also be of use in formulating
a complementary diagnostic test for canine vWD that may provide further data in
establishing the presence of the defective allele. Thus, production of the defective
vWF polypeptide, either through expression in transformed host cells as described
above for the active vWF polypeptide or through chemical synthesis, is also
contemplated by the present invention.

The term "gene" as referred to herein means a nucleic acid which encodes
a protein product. The term "nucleic acid" refers to a linear array of nucleotides and
nucleosides, such as genomic DNA, cDNA and DNA prepared by partial or total
chemical synthesis from nucleotides. The term "encoding" means that the nucleic
acid may be transcribed and translated into the desired polypeptide. "Polypeptide"
refers to amino acid sequences which comprise both full-length proteins and
fragments thereof. "Mutation" as referred to herein includes any alteration in a
nucleic acid sequence including, but not limited to, deletions, substitutions and
additions.

As referred to herein, the term "capable of hybridizing under high stringency
conditions" means annealing a strand of DNA complementary to the DNA of interest
under highly stringent conditions. Likewise, "capable of hybridizing under low
stringency conditions" refers to annealing a strand of DNA complementary to the
DNA of interest under low stringency conditions. In the present invention, hybridizing
under either high or low stringency conditions would involve hybridizing a nucleic
acid sequence (e.g., the complementary sequence to SEQ ID NO: 1 or portion
thereof), with a second target nucleic acid sequence. "High stringency conditions"
for the annealing process may involve, for example, high temperature and/or low salt
content, which disfavor hydrogen bonding contacts among mismatched base pairs.
"Low stringency conditions" would involve lower temperature and/or higher salt
concentration than that of high stringency conditions. Such conditions allow for two
DNA strands to anneal if substantial, though not near complete, complementarity
exists between the two strands, as is the case among DNA strands that code for the
same protein but differ in sequence due to the degeneracy of the genetic code.
Appropriate stringency conditions which promote DNA hybridization, for example, 6X
SSC at about 45 °C, followed by a wash of 2X SSC at 50 °C, are known to those
skilled in the art or can be found in Current Protocols in Molecular Biology, John
Wiley & Sons, NY (1989), 6.3.1-6.3.6. For example, the salt concentration in the
wash step can be selected from a low stringency of about 2X SSC at 50 °C to a high
stringency of about 0.2X SSC at 50 °C. In addition, the temperature in the wash
step can be increased from low stringency at room temperature, about 22 °C, to high
stringency conditions, at about 65 °C. Other stringency parameters are described
in Maniatis, T. et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY (1982), at pp. 387-389; see also Sambrook, J. et
al., Molecular Cloning: A Laboratory Manual, Second Edition, Volume 2, Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, NY, at pp. 8.46-8.47 (1989).
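The relationship between wash conditions and stringency described above can be sketched in a few lines of Python. This is only an illustration: the `Wash` class and `is_more_stringent` helper are hypothetical names, and the conversion factor assumes the usual recipe of 1X SSC as 0.15 M NaCl plus 0.015 M trisodium citrate (about 0.195 M sodium). A wash is treated as more stringent when it is at least as hot and at least as low in salt as the other, and differs in at least one of the two.

```python
from dataclasses import dataclass

# Assumption: 1X SSC = 0.15 M NaCl + 0.015 M trisodium citrate (3 Na+ per citrate).
SODIUM_MOLAR_PER_1X_SSC = 0.15 + 3 * 0.015


@dataclass(frozen=True)
class Wash:
    ssc_x: float   # SSC concentration as a multiple of 1X
    temp_c: float  # wash temperature in degrees Celsius

    @property
    def sodium_molar(self) -> float:
        """Approximate sodium ion concentration of the wash buffer."""
        return self.ssc_x * SODIUM_MOLAR_PER_1X_SSC


def is_more_stringent(a: Wash, b: Wash) -> bool:
    """True if wash a is no cooler and no saltier than b, and differs in one."""
    no_worse = a.temp_c >= b.temp_c and a.ssc_x <= b.ssc_x
    strictly_better = a.temp_c > b.temp_c or a.ssc_x < b.ssc_x
    return no_worse and strictly_better


# The two ends of the salt range quoted in the text, both at 50 degrees C.
low_stringency = Wash(ssc_x=2.0, temp_c=50)
high_stringency = Wash(ssc_x=0.2, temp_c=50)

if __name__ == "__main__":
    print(is_more_stringent(high_stringency, low_stringency))
    print(round(low_stringency.sodium_molar, 3), round(high_stringency.sodium_molar, 3))
```

Two washes that differ in opposite directions (hotter but saltier) are deliberately left unranked by this sketch, since the text gives no rule for trading temperature against salt.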

SPECIFIC EXAMPLE 1 
Materials And Methods 

Isolation of RNA. The source of the RNA was a uterus from a Scottish
Terrier affected with vWD (factor level < 0.1% and a clinical bleeder), that was
surgically removed because of infection. Spleen tissue was obtained from a
Doberman Pinscher affected with vWD that died from dilated cardiomyopathy (factor
level 7% and a clinical bleeder). Total RNA was extracted from the tissues using
Trizol (Life Technologies, Gaithersburg, MD). The integrity of the RNA was
assessed by agarose gel electrophoresis.

Design of PCR primer sets. Primers were designed to a few regions of the
gene, where sequences from two species were available (Lavergne, J.M. et al.,
Biochem Biophys Res Commun 194:1019-1024 (1993); Bakhshi, M.R. et al.,
Biochem Biophys Acta 1132:325-328 (1992)). These primers were designed using
rules for cross-species amplifications (Venta et al., "Genes-Specific Universal
Mammalian Sequence-Tagged Sites: Application To The Canine Genome" Biochem.
Genet. (1996) in press). Most of the primers had to be designed to other regions of
the gene using the human sequence alone (Mancuso, D.J. et al., Biochemistry
30:253-269 (1991)). Good amplification conditions were determined by using human
and canine genomic DNAs.

Reverse Transcriptase-PCR. Total RNA was reverse transcribed using
random primers (Bergenhem, N.C.H. et al., PNAS (USA) 89:8789-8802 (1992)). The
cDNA was amplified using the primer sets shown to work on canine genomic DNA.

DNA Sequence Analysis. Amplification products of the predicted sizes were
isolated from agarose gels by adsorption onto silica gel particles using the
manufacturer's method (Qiagen, Chatsworth, CA). Sequences were determined
using M P-5' end-labeled primers and a cycle sequencing kit (United States
Biochemical Corp., Cleveland, OH). The sequences of the 5' and 3' untranslated
regions were determined after amplification using Marathon™ RACE kits (Clontech,
Palo Alto, CA). Sequences were aligned using the Eugene software analysis
package (Lark Technologies, Houston, TX). The sequence of the canine intron four
was determined from PCR-amplified genomic DNA.

Design of a Diagnostic Test. PCR mutagenesis was used to create
diagnostic and control BsiE I and Sau96 I restriction enzyme sites for the test.
Amplification conditions for the test are: 94 °C, 1 min; 61 °C, 1 min; and 72 °C, 1 min,
for 50 cycles using cheek swab DNA (Richards, B. et al., Human Molecular Genetics
2:159-163 (1992)).
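For readers who want to see the amplification conditions above as data, the following Python sketch encodes the three-step cycle and computes the total hold time. The step labels (denaturation, annealing, extension) are the conventional reading of the three temperatures and are an assumption, as are the names `Step`, `CYCLE`, and `hold_minutes`; ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str    # conventional name for the step (assumed, not from the text)
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # hold time at that temperature


# One cycle as stated in the text: 94 C, 61 C and 72 C, one minute each.
CYCLE = (
    Step("denaturation", 94, 60),
    Step("annealing", 61, 60),
    Step("extension", 72, 60),
)
N_CYCLES = 50


def hold_minutes(cycle, n_cycles):
    """Total programmed hold time in minutes, excluding temperature ramps."""
    return n_cycles * sum(step.seconds for step in cycle) / 60


if __name__ == "__main__":
    print(hold_minutes(CYCLE, N_CYCLES))
```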

Population Survey. DNA was collected from 87 Scottish terriers from 16
pedigrees. DNA was isolated either from blood using standard procedures
(Sambrook, J. et al., Cold Spring Harbor Lab, Cold Spring Harbor, NY, 2nd Edition,
(1989)) or by cheek swab samples (Richards, B. et al., Human Molecular Genetics
2:159-163 (1992)). The genetic status of each animal in the survey was determined
using the BsiE I test described above.

Results

Comparison of the canine and human sequences. The alignment of the
canine and human prepro-von Willebrand factor amino acid sequences is shown in
Figures 2A-2C. The location of the Scottish terrier vWD mutation is indicated.
Potential N-glycosylation sites are shown in bold type. The known and
postulated integrin binding sites are boxed. Amino acid numbers are shown on the
right side of the figure. The human sequence is derived from Genbank accession
number X04385 (Bonthron, D. et al., Nucleic Acids Res 14:7125-7128 (1986)).

Overall, 85.1% sequence identity is seen between the prepro-vWF
sequences. The pro-region is slightly less conserved than the mature protein (81.4%
vs. 87.5%). There were no other noteworthy percentage sequence identity
differences seen in other regions of the gene, or between the known repeats
contained within the gene (data not shown). Fourteen potential N-linked
glycosylation sites are present in the canine sequence, all of which correspond to
similar sites contained within the human sequence. The two integrin binding sites
identified in the human vWF protein sequence (Lankhof, H. et al., Blood 86:1035-1042
(1995)) are conserved in the canine sequence as well (Figures 2A-2C). The
5' and 3' untranslated regions have diverged to a greater extent than the coding
region (data not shown), comparable to that found between the human and bovine
sequences derived for the 5' flanking region (Janel, N. et al., Gene 167:291-295
(1995)). Additional insights into the structure and function of the von Willebrand
factor can be gained by comparison of the complete human sequence (Mancuso,
D.J. et al., Biochemistry 30:253-269 (1991); Meyer, D. et al., Thromb Haemostasis
70:99-104 (1993)) and the complete canine sequence reported here.

The sequence for most of exon 28 was determined (Mancuso, D.J. et al.,
Thromb Haemost 69:980 (1993); Porter, C.A. et al., Mol Phylogenet Evol 5:89-101
(1996)). All three sequences are in complete agreement, although two silent
variants have been found in other breeds (Table 1, exon 28). Partial sequences of
exons 40 and 41 (cDNA nucleotide numbers 6923 to 7155, from the initiation codon)
were also determined as part of the development of a polymorphic simple tandem
repeat genetic marker (Shibuya, H. et al., Anim Genet 24:122 (1994)). There is a
single nucleotide sequence difference between this sequence ("T") and the
sequence of the present invention ("C") at nucleotide position 6928.

Scottish Terrier vWD mutation. Figure 3 shows nucleotide sequencing
ladders for the von Willebrand's Disease mutation region for normal (clear), carrier,
and affected Scottish terriers. The sequences were obtained directly from PCR
products derived from genomic DNAs in exon 4. The arrowheads show the location
of the C nucleotide that is deleted in the disease-causing allele. Note that in the
carrier ladder each base above the point of the mutation has a doublet appearance,
as predicted for deletion mutations. The factor levels reported for these animals
were: Normal, 54%; Carrier, 34%; Affected, <0.1%.

As a result of the deletion, a frameshift mutation at codon 88 leads to a new
stop codon 103 bases downstream. The resulting severely truncated protein of 119
amino acids does not include any of the mature von Willebrand factor region. The
identity of the base in the normal allele was determined from an unaffected dog.
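The effect of the frameshift can be illustrated with a short Python sketch that deletes one base from the start of the coding region and counts the codons read before the first in-frame stop. The sequence below is the first 122 codons as transcribed from SEQ ID NO: 1 in this document, so any transcription error there would carry over. The helper names are hypothetical, and the choice of which base of codon 88 to remove is an assumption made for illustration; deleting either C of that codon leaves the same downstream reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# First 122 codons of the canine vWF coding region, transcribed from SEQ ID NO: 1.
CODING_START = "".join("""
ATG AGT CCT ACC AGA CTT GTG AGG GTG CTG
CTG GCT CTG GCC CTC ATC TTG CCA GGG AAA CTT TGT ACA AAA GGG ACT
GTT GGA AGG TCA TCG ATG GCC CGA TGT AGC CTT CTC GGA GGT GAC TTC
ATC AAC ACC TTT GAT GAG AGC ATG TAC AGC TTT GCG GGA GAT TGC AGT
TAC CTC CTG GCT GGG GAC TGC CAG GAA CAC TCC ATC TCA CTT ATC GGG
GGT TTC CAA AAT GAC AAA AGA GTG AGC CTC TCC GTG TAT CTC GGA GAA
TTT TTC GAC ATT CAT TTG TTT GTC AAT GGT ACC ATG CTG CAG GGG ACC
CAA AGC ATC TCC ATG CCC TAC GCC TCC AAT GGG CTG TAT CTA GAG GCC
""".split())


def delete_base(cds, codon_number, offset=0):
    """Remove one base at the given offset (0-2) within a 1-based codon."""
    pos = 3 * (codon_number - 1) + offset
    return cds[:pos] + cds[pos + 1:]


def residues_before_stop(cds):
    """Count codons read from the start before the first in-frame stop codon.

    Returns None if no stop codon occurs within the supplied sequence.
    """
    for start in range(0, len(cds) - 2, 3):
        if cds[start:start + 3] in STOP_CODONS:
            return start // 3
    return None


# Assumption: the deleted C is taken to be the first base of codon 88 (CTC).
mutant = delete_base(CODING_START, codon_number=88)

if __name__ == "__main__":
    print(residues_before_stop(CODING_START))  # normal frame: no stop in this stretch
    print(residues_before_stop(mutant))        # shifted frame: premature stop
```

Under these assumptions the shifted frame reaches a stop codon after 119 residues, in line with the truncated protein length stated above, while the normal frame contains no stop in the same stretch.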

Development of a diagnostic test. A PCR primer was designed to produce
a BsiE I site in the mutant allele but not in the normal allele (Figure 4). The position
of the deleted nucleotide is indicated by an asterisk. The altered nucleotides in each
primer are underlined. The normal and mutant allele can also be distinguished using
Sau96 I. The naturally occurring Sau96 I sites are shown by double underlines.
The highly conserved donor and acceptor dinucleotide splice sequences are shown
in bold type.

In order to ensure that the restriction enzyme cut the amplified DNA to
completion, an internal control restriction site common to both alleles was designed
into the non-diagnostic primer. The test was verified by digestion of the DNA from
animals that were affected, obligate carriers, or normal (based on high factor levels
[greater than 100% of normal] obtained from commonly used testing labs and
reported to us by the owners, and also using breeds in which Type 3 vWD has not
been observed). The expected results were obtained (e.g., Figure 5). Five
vWD-affected animals from a colony founded from Scottish terriers (Brinkhous, K.M. et al.,
Ann. New York Acad. Sci. 370:191-203 (1981)) were also shown to be homozygous
for this mutation. An additional unaffected animal from this same colony was found
to be clear.

It would still be possible to misinterpret the results of the test if restriction
enzyme digestion was not complete, and if the rates of cleavage of the control
and diagnostic sites were vastly different. The rates of cleavage of the two BsiE I
sites were thus examined by partially digesting the PCR products and running them
on capillary electrophoresis. The rates were found to be very nearly equal (the
diagnostic site is cut 12% faster than the control site).

The mutagenesis primer was also designed to produce a Sau96 I site in the
normal allele but not the mutant allele. This is the reverse relationship compared to
the BsiE I-dependent test, with respect to which allele is cut. Natural internal Sau96

A possible mutation in the Doberman Pinscher gene. The complete
Scottish terrier sequence was compared to the complete Doberman Pinscher
sequence. Several nucleotide differences were found and were compared to the
nucleotides found in the same position in the human sequence as shown in Table
1 below. Most of these changes were silent. However, of three amino acid
changes, one is relatively non-conservative (F905L) and is proposed to be the
mutation that causes Doberman Pinscher vWD. Other data strongly suggest that the
nucleotide interchange at the end of exon 43 causes a cryptic splice site to be
activated, reducing the amount of normally processed mRNA, with a concomitant
decrease in the amount of vWF produced.

Mendelian inheritance. One test often used to verify the correct
identification of a mutant allele is its inheritance according to Mendel's law of
segregation. Three pedigrees were examined in which the normal and mutant
alleles were segregating, as shown in Figure 5. Exon four of the vWF gene was
PCR-amplified from genomic DNA. The PCR products were examined for the
presence of the normal and mutant vWF alleles by agarose gel electrophoresis after
digestion with BsiE I (see Figure 5). The affected animals are homozygous for the
mutant allele (229 bp; lanes 3 and 5). The other animals in this pedigree are
heterozygotes (251 bp and 229 bp; lanes 1, 2, 4, and 6), including the obligate
carrier parents.
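The way the gel pattern above maps onto genotype can be written as a small Python sketch. The band sizes (251 bp for the normal allele, 229 bp for the mutant allele after BsiE I digestion) come from the text; the function name `call_genotype`, the sizing tolerance, and the "no call" outcome are assumptions added for illustration and are not part of the described test.

```python
NORMAL_BP = 251  # band from the normal allele after BsiE I digestion
MUTANT_BP = 229  # band from the mutant allele after BsiE I digestion


def call_genotype(band_sizes_bp, tolerance_bp=5):
    """Classify an animal from the band sizes seen in its gel lane.

    tolerance_bp is an assumed allowance for sizing error on the gel.
    """
    has_normal = any(abs(b - NORMAL_BP) <= tolerance_bp for b in band_sizes_bp)
    has_mutant = any(abs(b - MUTANT_BP) <= tolerance_bp for b in band_sizes_bp)
    if has_normal and has_mutant:
        return "carrier"    # heterozygote: both bands present
    if has_mutant:
        return "affected"   # homozygous for the mutant allele
    if has_normal:
        return "clear"      # homozygous for the normal allele
    return "no call"        # neither expected band was seen


if __name__ == "__main__":
    for lane in ([251, 229], [229], [251]):
        print(lane, call_genotype(lane))
```

A lane with neither band is reported as "no call" rather than guessed, which mirrors the concern in the text about incomplete digestion leading to misinterpretation.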
Table 1 - Differences Between Scottie And Doberman
Protein And Nucleotide von Willebrand Factor Sequences
With Comparison To The Human Sequences

                      Amino Acid                      Codon
Exon        AA(1)     Human    Scottie    Doberman    Human    Scottie    Doberman
5' UT(2)    nuc -35   N/A(4)   N/A        N/A         N/A      A          G
4           65        S        S/FS(5)    S           TCC      TCC/TC_    TCC
5           173       M        R          K           ATG      AGG        AAG
                      S        T          T           TCC      ACA        ACC
[illegible]           C        C          C           TGC      TGT        TGC
[illegible] 905       F        F          L           TTT      TTC        TTA
24          1041      S        S          S           TCA      TCA        TCG
24          1042      S        S          S           TCC      TCC        TCA
26          1333      D        D          E           GAC      GAC        GAG
28          1349      Y        Y          Y           TAT      TAT        TAC*
42          2381      P        L          P           CCC      CTG        CCG
43          2479      S        S          S           TCG      TCG        TCA
45          2555      P        P          P           CCC      CCC        CCG
47          2591      P        P          P           CCC      CCT        CCC
49          2672      D        D          D           GAT      GAT        GAC
51          2744      E        E          E           GAG      GAG        GAA

(1) Amino acid residue position
(2) Untranslated region
(3) Nucleotide position
(4) Not Applicable
(5) Frameshift mutation
Boxed residues show amino acid differences between breeds
This site has been shown to be polymorphic in some breeds
The mature vWF protein begins in exon 18


The alleles, as typed by both the BsiE I and Sau96 I tests, showed no
inconsistencies with Mendelian inheritance. One of these pedigrees included two
affected animals, two phenotypically normal siblings, and the obligate carrier parents.
The two affected animals were found to be homozygous for the mutant allele, and the
other animals were found to be heterozygotes.

Population survey for the mutation. Cheek swabs or blood samples were
collected from 87 animals in order to determine the incidence of carriers in the U.S.
Scottish terrier population. Although we attempted to make the sample as random
as possible, these dogs were found to come from 16 pedigrees, several of which are
more distantly interconnected. This is due to some ascertainment bias, based on
ownership (as opposed to phenotypic ascertainment bias). In these 87 animals, four
affected and 15 carrier animals were found.

Discussion 

These results establish that the single base deletion found in exon four of the
vWF gene causes vWD in the Scottish terrier breed. The protein produced from the
mutant allele is extremely short and does not include any of the mature vWF protein.
Four Scottish terriers known to be affected with the disease are homozygous for the
mutation. Five other mixed-breed dogs descended from Scottish terriers, and
affected with vWD, are also homozygous for the mutation. No normal animals are
homozygous for the mutation. Unaffected obligate carriers are always heterozygous
for the mutation.

The gene frequency, as determined from the population survey, appears to
be around 0.13, resulting in a heterozygote frequency of about 23% and an expected
frequency of affected animals of about 2%. Although the sample size is relatively
small and somewhat biased, these data are in general agreement with the
protein-based surveys (Stokol, T. et al., Res Vet Sci 59:152-155 (1995); Brooks, M., Probl
In Vet Med 4:636-646 (1992)), in that the allele frequency is substantial.
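The figures quoted above follow from the survey counts (87 animals, four affected, 15 carriers) under Hardy-Weinberg proportions, as the following Python sketch shows. The function names are illustrative only, and the calculation assumes random mating, which the text itself notes is only approximately true for this sample.

```python
def allele_frequency(n_total, n_affected, n_carrier):
    """Mutant allele frequency q from direct genotype counts.

    Each affected animal carries two mutant alleles and each carrier one.
    """
    return (2 * n_affected + n_carrier) / (2 * n_total)


def hardy_weinberg(q):
    """Expected genotype frequencies for mutant allele frequency q."""
    p = 1 - q
    return {"clear": p * p, "carrier": 2 * p * q, "affected": q * q}


# Counts from the population survey reported in the text.
q = allele_frequency(n_total=87, n_affected=4, n_carrier=15)
expected = hardy_weinberg(q)

if __name__ == "__main__":
    print(f"q = {q:.2f}")
    for genotype, freq in expected.items():
        print(f"{genotype}: {freq:.1%}")
```

With these counts q is 23/174, which rounds to 0.13, and the expected carrier and affected frequencies round to 23% and 2% respectively.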

All data collected thus far indicate that this mutation accounts for essentially
all of the von Willebrand's disease found in Scottish terriers. This result is consistent
with the results found for other genetic diseases, defined at the molecular level, in
various domestic animals (Shuster, D.E. et al., PNAS (USA) 89:9225-9229 (1992);
Rudolph, J.A. et al., Nat Genet 2:144-147 (1992); O'Brien, P.J. et al., JAVMA
203:842-851 (1993)). A likely explanation may be found in the pronounced founder
effect that occurs in domestic animals, compared to most human and wild animal
populations.

Published data using the protein-based factor assays have shown that, at
least in several instances, obligate carriers have had factor levels that would lead
to a diagnosis of "clear" of the disease allele. For example, in one study an obligate
carrier had a factor level of 78% (Johnson, G.S. et al., JAVMA 176:1261-1263
(1980)). In another study, at least some of the obligate carriers had factor levels of
65% or greater (Brinkhous, K.M. et al., Ann. New York Acad. Sci. 370:191-203
(1981)). In addition, the number of animals that fall into an equivocal range can be
substantial. In one study, 19% of Scottish terriers fell in this range (50-65% of the
normal vWF antigen level) (Stokol, T. et al., Res Vet Sci 59:152-155 (1995)). Thus,
although the protein-based tests have been useful, the certainty of the DNA-based
test described herein should relieve the necessity of repeated testing and the
variability associated with the protein-based assays.

The mutation is present in the pre-vWF part of the molecule. This part of the
molecule is processed off prior to delivery of the mature protein into the plasma.
This pre-portion of the molecule is important for the assembly of the mature vWF
protein (Verweij, L. et al., EMBO J 6:2885-2890 (1987); Wise, R.J. et al., Cell
52:229-236 (1988)). With the Scottish terrier frameshift vWD mutation, neither this
pre-portion nor any of the mature factor is ever produced, in keeping with the fact
that no factor has ever been detected in the blood of affected dogs.

The determination of the complete canine vWF cDNA sequence will have an
impact upon the development of carrier tests for other breeds and other species as
well. Currently, Shetland sheepdogs and Dutch Kooikers are known to have a
significant amount of Type 3 vWD (Brooks, M. et al., JAVMA 200:1123-1127 (1992);
Slappendel, R.J., Vet-Q 17:S21-S22 (1995)). Type 3 vWD has occasionally been seen
in other breeds as well (e.g., Johnson, G.S. et al., JAVMA 176:1261-1263 (1980)).
All Type 3 vWD mutations described in humans to date have been found within the
vWF gene itself. The availability of the canine sequence will make it easier to find
the mutations in these breeds. In addition, at least some Type 1 mutations have
been found within the human vWF gene, and thus Type 1 mutations may also be
found within the vWF gene for breeds affected with that form of the disease. The
availability of two divergent mammalian vWF cDNA sequences will also make it
much easier to sequence the gene from other mammalian species using
cross-species PCR methods (e.g., Venta et al., Biochem. Genet. (1996) in press).

The test described herein for the detection of the mutation in Scottish terriers
may be performed on small amounts of DNA from any tissue. The tissues that are
the least invasive to obtain are blood and buccal cells. For maximum convenience,
a cheek swab as a source of DNA is preferred.

The foregoing discussion discloses and describes merely exemplary
embodiments of the present invention. One skilled in the art will readily recognize
from such discussion, and from the accompanying drawings, that various changes,
modifications and variations can be made therein without departing from the spirit
and scope of the invention.

All patents and other publications cited herein are expressly incorporated by
reference.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CATTAAAAGG TCCTGGCTGG GAGCTTTTTT TTGGGACCAG CACTCCATGT TCAAGGGCAA 6 0 

ACAGGGGCCA ATTAGGATCA ATCTTTTTTC TTTCTTTTTT TAAAAAAAAA AATTCTTCCC 12 0 

ACTTTGCACA CGGACAGTAG TACATACCAG TAGCTCTCTG CGAGGACGGT GATCACTAAT 180 

CATTTCTCCT GCTTCGTGGC AG ATG AGT CCT ACC AGA CTT GTG AGG GTG CTG 232 

Met Ser Pro Thr Arg Leu Val Arg Val Leu 
1 5 io 

CTG GCT CTG GCC CTC ATC TTG CCA GGG AAA CTT TGT ACA AAA GGG ACT 2SC 
Leu Ala Leu Ala Leu lie Leu Pro Gly Lys Leu Cys Thr Lys Gly Thr 
15 20 25 

GTT GGA AGG TCA TCG ATG GCC CGA TGT AGC CTT CTC GGA GGT GAC TTC 328 
Val Gly Arg Ser Ser Met Ala Arg Cys Ser Leu Leu Gly Gly Asp Phe 
30 35 40 



ATC AAC ACC TTT GAT GAG AGC ATG TAC AGC TTT GCG GGA GAT TGC AGT 376
Ile Asn Thr Phe Asp Glu Ser Met Tyr Ser Phe Ala Gly Asp Cys Ser
45 50 55

TAC CTC CTG GCT GGG GAC TGC CAG GAA CAC TCC ATC TCA CTT ATC GGG 424
Tyr Leu Leu Ala Gly Asp Cys Gln Glu His Ser Ile Ser Leu Ile Gly
60 65 70

GGT TTC CAA AAT GAC AAA AGA GTG AGC CTC TCC GTG TAT CTC GGA GAA 472
Gly Phe Gln Asn Asp Lys Arg Val Ser Leu Ser Val Tyr Leu Gly Glu
75 80 85 90

TTT TTC GAC ATT CAT TTG TTT GTC AAT GGT ACC ATG CTG CAG GGG ACC 520
Phe Phe Asp Ile His Leu Phe Val Asn Gly Thr Met Leu Gln Gly Thr
95 100 105

CAA AGC ATC TCC ATG CCC TAC GCC TCC AAT GGG CTG TAT CTA GAG GCC 568
Gln Ser Ile Ser Met Pro Tyr Ala Ser Asn Gly Leu Tyr Leu Glu Ala
110 115 120

GAG GCT GGC TAC TAC AAG CTG TCC AGT GAG GCC TAC GGC TTT GTG GCC 616
Glu Ala Gly Tyr Tyr Lys Leu Ser Ser Glu Ala Tyr Gly Phe Val Ala
125 130 135

AGA ATT GAT GGC AAT GGC AAC TTT CAA GTC CTG CTG TCA GAC AGA TAC 664
Arg Ile Asp Gly Asn Gly Asn Phe Gln Val Leu Leu Ser Asp Arg Tyr
140 145 150

TTC AAC AAG ACC TGT GGG CTG TGT GGC AAC TTT AAT ATC TTT GCT GAG 712
Phe Asn Lys Thr Cys Gly Leu Cys Gly Asn Phe Asn Ile Phe Ala Glu
155 160 165 170

GAT GAC TTC AAG ACT CAA GAA GGG ACG TTG ACT TCG GAC CCC TAT GAC 760
Asp Asp Phe Lys Thr Gln Glu Gly Thr Leu Thr Ser Asp Pro Tyr Asp
175 180 185

TTT GCC AAC TCC TGG GCC CTG AGC AGT GGG GAA CAA CGG TGC AAA CGG 808
Phe Ala Asn Ser Trp Ala Leu Ser Ser Gly Glu Gln Arg Cys Lys Arg
190 195 200

GTG TCC CCT CCC AGC AGC CCA TGC AAT GTC TCC TCT GAT GAA GTG CAG 856
Val Ser Pro Pro Ser Ser Pro Cys Asn Val Ser Ser Asp Glu Val Gln
205 210 215

CAG GTC CTG TGG GAG CAG TGC CAG CTC CTG AAG AGT GCC TCG GTG TTT 904
Gln Val Leu Trp Glu Gln Cys Gln Leu Leu Lys Ser Ala Ser Val Phe
220 225 230

GCC CGC TGC CAC CCG CTG GTG GAC CCT GAG CCT TTT GTC GCC CTG TGT 952
Ala Arg Cys His Pro Leu Val Asp Pro Glu Pro Phe Val Ala Leu Cys
235 240 245 250

GAA AGG ACT CTG TGC ACC TGT GTC CAG GGG ATG GAG TGC CCT TGT GCG 1000
Glu Arg Thr Leu Cys Thr Cys Val Gln Gly Met Glu Cys Pro Cys Ala
255 260 265

GTC CTC CTG GAG TAC GCC CGG GCC TGT GCC CAG CAG GGG ATT GTC TTG 1048
Val Leu Leu Glu Tyr Ala Arg Ala Cys Ala Gln Gln Gly Ile Val Leu
270 275 280

TAC GGC TGG ACC GAC CAC AGC GTC TGC CGA CCA GCA TGC CCT GCT GGC 1096
Tyr Gly Trp Thr Asp His Ser Val Cys Arg Pro Ala Cys Pro Ala Gly
285 290 295

ATG GAG TAC AAG GAG TGC GTG TCC CCT TGC ACC AGA ACT TGC CAG AGC 1144
Met Glu Tyr Lys Glu Cys Val Ser Pro Cys Thr Arg Thr Cys Gln Ser
300 305 310

CTT CAT GTC AAA GAA GTG TGT CAG GAG CAA TGT GTA GAT GGC TGC AGC 1192
Leu His Val Lys Glu Val Cys Gln Glu Gln Cys Val Asp Gly Cys Ser
315 320 325 330

TGC CCC GAG GGC CAG CTC CTG GAT GAA GGC CAC TGC GTG GGA AGT GCT 1240
Cys Pro Glu Gly Gln Leu Leu Asp Glu Gly His Cys Val Gly Ser Ala
335 340 345

GAG TGT TCC TGT GTG CAT GCT GGG CAA CGG TAC CCT CCG GGC GCC TCC 1288
Glu Cys Ser Cys Val His Ala Gly Gln Arg Tyr Pro Pro Gly Ala Ser
350 355 360

CTC TTA CAG GAC TGC CAC ACC TGC ATT TGC CGA AAT AGC CTG TGG ATC 1336
Leu Leu Gln Asp Cys His Thr Cys Ile Cys Arg Asn Ser Leu Trp Ile
365 370 375

TGC AGC AAT GAA GAA TGC CCA GGC GAG TGT CTG GTC ACA GGA CAG TCC 1384
Cys Ser Asn Glu Glu Cys Pro Gly Glu Cys Leu Val Thr Gly Gln Ser
380 385 390

CAC TTC AAG AGC TTC GAC AAC AGG TAC TTC ACC TTC AGT GGG GTC TGC 1432
His Phe Lys Ser Phe Asp Asn Arg Tyr Phe Thr Phe Ser Gly Val Cys
395 400 405 410

ca^ TA n crn ctc gc cag ga<~ tgc r^n ^a^ ^— 



jt: . tj": oa: ga^ ctc gat gct gt; - tgc acc cgc: 
Ile Glu Thr Val Gln Cys Ala Asp Asp Leu Asp Ala Val Cys Thr Arg
430 435 440

TCG GTC ACC GTC CGC CTG CCT GGA CAT CAC AAC AGC CTT GTG AAG CTG 1576

Ser Val Thr Val Arg Leu Pro Gly His His Asn Ser Leu Val Lys Leu 
445 450 455 

AAG AAT GGG GGA GGA GTC TCC ATG GAT GGC CAG GAT ATC CAG ATT CCT 1624 
Lys Asn Gly Gly Gly Val Ser Met Asp Gly Gin Asp lie Gin lie Pro 

460 465 470 

CTC CTG CAA GGT GAC CTC CGC ATC CAG CAC ACC GTG ATG GCC TCC GTG 16 72 

Leu Leu Gin Gly Asp Leu Arg lie Gin His Thr Val Met Ala Ser Val 
475 480 485 490 

CGC CTC AGC TAC GGG GAG GAC CTG CAG ATG GAT TCG GAC GTC CGG GGC 17 2 0 

Arg Leu Ser Tyr Gly Glu Asp Leu Gin Met Asp Ser Asp Val Arg Gly 
495 500 505 

AGG CTA CTG GTG ACG CTG TAC CCC GCC TAC GCG GGG AAG ACG TGC GGC 17 6 8 

Arg Leu Leu Val Thr Leu Tyr Pro Ala Tyr Ala Gly Lys Thr Cys Gly 
510 515 520 

CGT GGC GGG AAC TAC AAC GGC AAC CGG GGG GAC GAC TTC GTG ACG CCC 1816 
Arg Gly Gly Asn Tyr Asn Gly Asn Arg Gly Asp Asp Phe Val Thr Pro 
525 530 535 

GCA GGC CTG GCG GAG CCC CTG GTG GAG GAC TTC GGG AAC GCC TGG AAG 1864 
Ala Gly Leu Ala Glu Pro Leu Val Glu Asp Phe Gly Asn Ala Trp Lys 

540 545 550 

CTG CTC GGG GCC TGC GAG AAC CTG CAG AAG CAG CAC CGC GAT CCC TGC 1912 
Leu Leu Gly Ala Cys Glu Asn Leu Gin Lys Gin His Arg Asp Pro Cys 
555 560 565 570 

AGC CTC AAC CCG CGC CAG GCC AGG TTT GCG GAG GAG GCG TGC GCG CTG i96 0 

Ser Leu Asn Pro Arg Gin Ala Arg Phe Ala Glu Glu Ala Cys Ala Leu 
575 580 585 

CTG ACG TCC TCG AAG TTC GAG CCC TGC CAC CGA GCG GTG GGT CCT CAG 2008 
Leu Thr Ser Ser Lys Phe Glu Pro Cys His Arg Ala Val Gly Pro Gin 
590 595 600 

CCC TAC GTG CAG AAC TGC CTC TAC GAC GTC TGC TCC TGC TCC GAC GGC 2056 
Pro Tyr Val Gin Asn Cys Leu Tyr Asp Val Cys Ser Cys Ser Asp Gly 
605 610 615 

AGA GAC TGT CTT TGC AGC GCC GTG GCC AAC TAC GCC GCA GCC GTG GCC 2104 
Arg Asp Cys Leu Cys Ser Ala Val Ala Asn Tyr Ala Ala Ala Val Ala 

620 625 630 

CGG AGG GGC GTG CAC ATC GCG TGG CGG GAG CCG GGC TTC TGT GCG CTG 2152 
Arg Arg Gly Val Hia He Ala Trp Arg Glu Pro Gly Phe Cys Ala Leu 
635 640 645 650 

AGC TGC CCC CAG GGC CAG GTG TAC CTG CAG TGT GGG ACC CCC TGC AAC 2200 
Ser Cys Pro Gin Gly Gin Val Tyr Leu Gin Cys Gly Thr Pro Cys Asn 
655 660 665 

ATG ACC TGT CTC TCC CTC TCT TAC CCG GAG GAG GAC TGC AAT GAG GTC 2248 
Met Thr Cys Leu Ser Leu Ser Tyr Pro Glu Glu Asp Cys Asn Glu Val 
€70 675 680 

TGC TTG GAA AGC TGC TTC TCC CCC CCA GGG CTG TAC CTG GAT GAG AGG 2296 
Cys Leu Glu Ser Cys Phe Ser Pro Pro Gly Leu Tyr Leu Asp Glu Arg 
685 690 695 

GGA GAT TGT GTG CCC AAG OCT CAG TGT CCC TGT TAC TAT GAT GGT GAG 2344 
Gly Asp Cys Val Pro Lys Ala Gin Cys Pro Cys Tyr Tyr Asp Gly Glu 

700 705 710 



WO 98/03683 



PCT/US97/12606



ATC TTT CAG CCC GAA GAC ATC TTC TCA GAC CAT CAC ACC ATG TGC TAC 2 3 92 

He Phe Gin Pro Glu Asp He Phe Ser Asp His His Thr Met Cys Tyr 

715 720 725 730 

TGT GAG GAT GGC TTC ATG CAC TGT ACC ACA AGT GGA GGC CTG GGA AGC 2 44 0 

Cys Glu Asp Gly Phe Met His Cys Thr Thr Ser Gly Gly Leu Gly Ser 
735 740 745 

CTG CTG CCC AAC CCG GTG CTC AGC AGC CCC CGG TGT CAC CGC AGC AAA 2 4 88 

Leu Leu Pro Asn Pro Val Leu Ser Ser Pro Arg Cys His Arg Ser Lys 
750 755 760 

AGG AGC CTG TCC TGT CGG CCC CCC ATG GTC AAG TTG GTG TGT CCC GCT 2 5 36 

Arg Ser Leu Ser Cys Arg Pro Pro Met Val Lys Leu Val Cys Pro Ala 
765 770 775 

GAT AAC CCG AGG GCT GAA GGA CTG GAG TGT GCC AAA ACC TGC CAG AAC 2 5 84 

Asp Asn Pro Arg Ala Glu Gly Leu Glu Cys Ala Lys Thr Cys Gin Asn 
760 785 790 

TAT GAC CTG CAG TGC ATG AGC ACA GGC TGT GTC TCC GGC TGC CTC TGC 2 6 32 

Tyr Asp Leu Gin Cys Met Ser Thr Gly Cys Val Ser Gly Cys Leu Cys 
795 800 805 810 

CCG CAG GGC ATG GTC CGG CAT GAA AAC AGG TGT GTG GCG CTG GAA AGA 26 8 0 

Pro Gin Gly Met Val Arg His Glu Asn Arg Cys Val Ala Leu Glu Arg 
815 820 825 

TGT CCC TGC TTC CAC CAA GGC CAA GAG TAC GCC CCA GGA GAA ACC GTG 2728 
Cys Pro Cys Phe His Gin Gly Gin Glu Tyr Ala Pro Gly Glu Thr Val 
830 835 840 

AAA ATT GAC TGC AAC ACT TGT GTC TGT CGG GAC CGG AAG TGG ACC TGC 2776 
Lys He Asp Cys Asn Thr Cys Val Cys Arg Asp Arg Lys Trp Thr Cys 
845 850 855 

ACA GAC CAT GTG TGT GAT GCC ACT TGC TCT GCC ATC GGC ATG GCG CAC 2824
Thr Asp His Val Cys Asp Ala Thr Cys Ser Ala Ile Gly Met Ala His
860 865 870

TAC CTC ACC TTC GAC GGA CTC AAG TAC CTG TTC CCT GGG GAG TGC CAG 2872 
Tyr Leu Thr Phe Asp Gly Leu Lys Tyr Leu Phe Pro Gly Glu Cys Gin 
875 880 885 890 

TAT GTT CTG GTG CAG GAT TAC TGC GGC AGT AAC CCT GGG ACC TTA CGG 2920 
Tyr Val Leu Val Gin Asp Tyr Cys Gly Ser Asn Pro Gly Thr Leu Arg 

895 900 905 

ATC CTG GTG GGG AAC GAG GGG TGC AGC TAC CCC TCA GTG AAA TGC AAG 2968 
He Leu Val Gly Asn Glu Gly Cys Ser Tyr Pro Ser Val Lys Cys Lys 
910 91S 920 

AAG CGG GTC ACC ATC CTG GTG GAA GGA GGA GAG ATT GAA CTG TTT GAT 3016 
Lys Arg Val Thr He Leu Val Glu Gly Gly Glu He Glu Leu Phe Asp 
925 930 935 

GGG GAG GTG AAT GTG AAG AAA CCC ATG AAG GAT GAG ACT CAC TTT GAG 3064 
Gly Glu Val Asm Val Lys Lys Pro Met Lys Asp Glu Thr His Phe Glu 
940 945 950 

GTG GTA GAG TTT GCT CAG TA~ T" " 



TGO GAC CAC Co AG. A TV TCT GTG ACC CTG AA. ; CO, 

Ser Val Val Trp Asp His Arg Leu Ser Ile Ser Val Thr Leu Lys Arg
975 980 985

ACA TAC CAG GAG CAG GTG TGT GGC CTG TGT GGG AAT TTT GAT GGC ATC 3208
Thr Tyr Gln Glu Gln Val Cys Gly Leu Cys Gly Asn Phe Asp Gly Ile
990 995 1000

CAG AAC AAT GAT TTC ACC AGC AGC AGC CTC CAA ATA GAA GAA GAC CCT 3 2 56 

Gin Asn Asn Asp Phe Thr Ser Ser Ser Leu Gin He Glu Glu Asp Pro 

1005 1010 1015 

GTG GAC TTT GGG AAT TCC TGG AAA GTG AAC CCG CAG TGT GCC GAC ACC 3 3 04 

Val Asp Phe Gly Asn Ser Trp Lys Val Asn Pro Gin Cys Ala Asp Thr 
1020 1025 1030 

AAG AAA GTA CCA CTG GAC TCA TCC CCT GCC GTC TGC CAC AAC AAC ATC 3 3 52 

Lys Lys Val Pro Leu Asp Ser Ser Pro Ala Val Cys His Asn Asn He 
1035 1040 1045 1050 

ATG AAG CAG ACG ATG GTG GAT TCC TCC TGC AGG ATC CTC ACC AGT GAT 3 4 00 

Met Lys Gin Thr Met Val Asp Ser Ser Cys Arg He Leu Thr Ser Asp 
1055 1060 1065 

ATT TTC CAG GAC TGC AAC AGG CTG GTG GAC CCT GAG CCA TTC CTG GAC 3448 
He Phe Gin Asp Cys Asn Arg Leu Val Asp Pro Glu Pro Phe Leu Asp 
1070 1075 1080 

ATT TGC ATC TAC GAC ACT TGC TCC TGT GAG TCC ATT GGG GAC TGC ACC 34 96 

He Cys He Tyr Asp Thr Cys Ser Cys Glu Ser He Gly Asp Cys Thr 

1065 1090 1095 

TGC TTC TGT GAC ACC ATT GCT GCT TAC GCC CAC GTC TGT GCC CAG CAT 3 544 

Cys Phe Cys Asp Thr He Ala Ala Tyr Ala His Val Cys Ala Gin His 
1100 1105 1110 

GGC AAG GTG GTA GCC TGG AGG ACA GCC ACA TTC TGT CCC CAG AAT TGC 3 5 92 

Gly Lys Val Val Ala Trp Arg Thr Ala Thr Phe Cys Pro Gin Asn Cys 
1115 1120 1125 1130 

GAG GAG CGG AAT CTC CAC GAG AAT GGG TAT GAG TGT GAG TGG CGC TAT 3640
Glu Glu Arg Asn Leu His Glu Asn Gly Tyr Glu Cys Glu Trp Arg Tyr
1135 1140 1145

AAC AGC TGT GCC CCT GCC TGT CCC ATC ACG TGC CAG CAC CCC GAG CCA 3688
Asn Ser Cys Ala Pro Ala Cys Pro Ile Thr Cys Gln His Pro Glu Pro
1150 1155 1160

CTG GCA TGC CCT GTA CAG TGT GTT GAA GGT TGC CAT GCG CAC TGC CCT 3736
Leu Ala Cys Pro Val Gln Cys Val Glu Gly Cys His Ala His Cys Pro
1165 1170 1175

CCA GGG AAA ATC CTG GAT GAG CTT TTG CAG ACC TGC ATC GAC CCT GAA 3784 
Pro Gly Lys lie Leu Asp Glu Leu Leu Gin Thr Cys He Asp Pro Glu 
1180 1185 1190 

GAC TGT CCT GTG TGT GAG GTG GCT GGT CGT CGC TTG GCC CCA GGA AAG 3832 
Asp Cys Pro Val Cys Glu Val Ala Gly Arg Arg Leu Ala Pro Gly Lys 
1195 1200 1205 1210 

AAA ATC ATC TTG AAC CCC AGT GAC CCT GAG CAC TGC CAA ATT TGT AAT 3880 
Lys He He Leu Asn Pro Ser Asp Pro Glu His Cys Gin He Cys Asn 
1215 1220 1225 

TGT GAT GGT GTC AAC TTC ACC TGT AAG GCC TGC AGA GAA CCC GGA AGT 3928 
Cys Asp Gly Val Asn Phe Thr Cys Lys Ala Cys Arg Glu Pro Gly Ser 
1230 1235 1240 

GTT GTG GTG CCC CCC ACA GAT GGC CCC ATT GGC TCT ACC ACC TCG TAT 3976
Val Val Val Pro Pro Thr Asp Gly Pro Ile Gly Ser Thr Thr Ser Tyr
1245 1250 1255

GTG GAG GAC ACG TCG GAG CCG CCC CTC CAT GAC TTC CAC TGC AGC AGG 4024
Val Glu Asp Thr Ser Glu Pro Pro Leu His Asp Phe His Cys Ser Arg
1260 1265 1270

CTT CTG GAC CTG GTT TTC CTG CTG GAT GGC TCC TCC AAG CTG TCT GAG 4072 
Leu Leu Asp Leu Val Phe Leu Leu Asp Gly Ser Ser Lys Leu Ser Glu 
1275 1280 1265 1290 

GAC GAG TTT GAA GTG CTG AAG GTC TTT GTG GTG GGT ATG ATG GAG CAT 4120 
Asp Glu Phe Glu Val Leu Lys Val Phe Val Val Gly Met Met Glu His 
1295 1300 1305 

CTG CAC ATC TCC CAG AAG CGG ATC CGC GTG GCT GTG GTG GAG TAC CAC 416 8 

Leu His lie Ser Gin Lys Arg lie Arg Val Ala Val Val Glu Tyr His 
1310 1315 1320 

GAC GGC TCC CAC GCC TAC ATC GAG CTC AAG GAC CGG AAG CGA CCC TCA 4 216 

Asp Gly Ser His Ala Tyr lie Glu Leu Lys Asp Arg Lys Arg Pro Ser 
1325 1330 1335 

GAG CTG CGG CGC ATC ACC AGC CAG GTG AAG TAC GCG GGC AGC GAG GTG 4 264 

Glu Leu Arg Arg He Thr Ser Gin Val Lys Tyr Ala Gly Ser Glu Val 
1340 1345 1350 

GCC TCC ACC AGT GAG GTC TTA AAG TAC ACG CTG TTC CAG ATC TTT GGC 4 312 

Ala Ser Thr Ser Glu Val Leu Lys Tyr Thr Leu Phe Gin He Phe Gly 
1355 1360 1365 1370 

AAG ATC GAC CGC CCG GAA GCG TCT CGC ATT GCC CTG CTC CTG ATG GCC 4 360 

Lys He Asp Arg Pro Glu Ala Ser Arg He Ala Leu Leu Leu Met. Ala 
1375 1380 13B5 

AGC CAG GAG CCC TCA AGG CTG GCC CGG AAT TTG GTC CGC TAT GTG CAG 4408 
Ser Gin Glu Pro Ser Arg Leu Ala Arg Asn Leu Val Arg Tyr Val Gin 
1390 1395 1400 

GGC CTG AAG AAG AAG AAA GTC ATT GTC ATC CCT GTG GGC ATC GOG CCC 44 56 

Gly Leu Lys Lys Lys Lys Val He Val He Pro Val Gly He Gly Pro 
1405 1410 1415 

CAC GCC AGC CTT AAG CAG ATC CAC CTC ATA GAG AAG CAG GCC CCT GAG 4504 
His Ala Ser Leu Lys Gin He His Leu He Glu Lys Gin Ala Pro Glu 
1420 1425 1430 

AAC AAG GCC TTT GTG TTC AGT GGT GTG GAT GAG TTG GAG CAG CGA AGG 4552 
Asn Lys Ala Phe Val Phe Ser Gly Val Asp Glu Leu Glu Gin Arg Arg 
1435 1440 1445 1450 

GAT GAG ATT ATC AAC TAC CTC TGT GAC CTT GCC CCC GAA GCA CCT GCC 4600 
Asp Glu He He Asn Tyr Leu Cys Asp Leu Ala Pro Glu Ala Pro Ala 
1455 1460 1465 

CCT ACT CAG CAC CCC CCA ATG GCC CAG GTC ACQ GTG GGT TCG GAG CTG 464 6 

Pro Thr Gin His Pro Pro Met Ala Gin Val Thr Val Gly Ser Glu Leu 
1470 1475 1480 

TTG GGG GTT TCA TCT CCA GGA CCC AAA AGG AAC TCC ATG GTC CTG GAT 4696 
Leu Gly Val Ser Ser Pro Gly Pro Lys Arg Asn Ser Met Val Leu Asp 
1485 1490 1495 

rrrn ttt nrr rrc gaa nnn rrr- **~- ^-p- tr-"* »xr 



»vA AAA AGC AGG GAG TTC ATG GAG GAG GTG ATT CAG CGG ATG GAP rrrv: 
asi; Lyo Ser Arg Glu Phe Met Glu Glu Val He Gin Arg Met Asp Val 
1515 1520 1525 1530

GGC CAG GAC AGG ATC CAC GTC ACA GTG CTG CAG TAC TCG TAC ATG GTG 4840
Gly Gln Asp Arg Ile His Val Thr Val Leu Gln Tyr Ser Tyr Met Val
1535 1540 1545

ACC GTG GAG TAC ACC TTC AGC GAG GCG CAG TCC AAG GGC GAG GTC CTA 4 88 8 

Thr Val Glu Tyr Thr Phe Ser Glu Ala Gin Ser Lys Gly Glu Val Leu 
1550 1555 1560 

CAG CAG GTG CGG GAT ATC CGA TAC CGG GGT GGC AAC AGG ACC AAC ACT 4 9 36 

Gin Gin Val Arg Asp He Arg Tyr Arg Gly Gly Asn Arg Thr Asn Thr 
1565 1570 1575 

GGA CTG GCC CTG CAA TAC CTG TCC GAA CAC AGC TTC TCG GTC AGC CAG 4 984 

Gly Leu Ala Leu Gin Tyr Leu Ser Glu His Ser Phe Ser Val Ser Gin 
1580 1585 1590 

GGG GAC CGG GAG CAG GTA CCT AAC CTG GTC TAC ATG GTC ACA GGA AAC 5032
Gly Asp Arg Glu Gln Val Pro Asn Leu Val Tyr Met Val Thr Gly Asn
1595 1600 1605 1610

CCC GCT TCT GAT GAG ATC AAG CGG ATG CCT GGA GAC ATC CAG GTG GTG 5080
Pro Ala Ser Asp Glu Ile Lys Arg Met Pro Gly Asp Ile Gln Val Val
1615 1620 1625

CCC ATC GGG GTG GGT CCA CAT GCC AAT GTG CAG GAG CTG GAG AAG ATT 5128
Pro Ile Gly Val Gly Pro His Ala Asn Val Gln Glu Leu Glu Lys Ile
1630 1635 1640

GGC TGG CCC AAT GCC CCC ATC CTC ATC CAT GAC TTT GAG ATG CTC CCT 5176
Gly Trp Pro Asn Ala Pro Ile Leu Ile His Asp Phe Glu Met Leu Pro
1645 1650 1655

CGA GAG GCT CCT GAT CTG GTG CTA CAG AGG TGC TGC TCT GGA GAG GGG 5224
Arg Glu Ala Pro Asp Leu Val Leu Gln Arg Cys Cys Ser Gly Glu Gly
1660 1665 1670

CTG CAG ATC CCC ACC CTC TCC CCC ACC CCA GAT TGC AGC CAG CCC CTG 5272
Leu Gln Ile Pro Thr Leu Ser Pro Thr Pro Asp Cys Ser Gln Pro Leu
1675 1680 1685 1690

GAT GTG GTC CTC CTC CTG GAT GGC TCT TCC AGC ATT CCA GCT TCT TAC 5320
Asp Val Val Leu Leu Leu Asp Gly Ser Ser Ser Ile Pro Ala Ser Tyr
1695 1700 1705

TTT GAT GAA ATG AAG AGC TTC ACC AAG GCT TTT ATT TCA AGA GCT AAT 5368
Phe Asp Glu Met Lys Ser Phe Thr Lys Ala Phe Ile Ser Arg Ala Asn
1710 1715 1720

ATA GGG CCC CGG CTC ACT CAA GTG TCG GTG CTG CAA TAT GGA AGC ATC 5416
Ile Gly Pro Arg Leu Thr Gln Val Ser Val Leu Gln Tyr Gly Ser Ile
1725 1730 1735

ACC ACT ATC GAT GTG CCT TGG AAT GTA GCC TAT GAG AAA GTC CAT TTA 5464
Thr Thr Ile Asp Val Pro Trp Asn Val Ala Tyr Glu Lys Val His Leu
1740 1745 1750

CTG AGC CTT GTG GAC CTC ATG CAG CAG GAG GGA GGC CCC AGC GAA ATT 5512
Leu Ser Leu Val Asp Leu Met Gln Gln Glu Gly Gly Pro Ser Glu Ile
1755 1760 1765 1770

GGG GAT GCT TTG AGC TTT GCC GTG CGA TAT GTC ACC TCA GAA GTC CAT 5560
Gly Asp Ala Leu Ser Phe Ala Val Arg Tyr Val Thr Ser Glu Val His
1775 1780 1785

GGT GCC AGG CCC GGA GCC TCG AAA GCG GTG GTT ATC CTA GTC ACA GAT 5608
Gly Ala Arg Pro Gly Ala Ser Lys Ala Val Val Ile Leu Val Thr Asp
1790 1795 1800

GTC TCC GTG GAT TCA GTG GAT GCT GCA GCC GAG GCC GCC AGA TCC AAC 5656
Val Ser Val Asp Ser Val Asp Ala Ala Ala Glu Ala Ala Arg Ser Asn
1805 1810 1815

CGA GTG ACA GTG TTC CCC ATT GGA ATC GGG GAT CGG TAC AGT GAG GCC 5 704 

Arg Val Thr Val Phe Pro lie Gly lie Gly Asp Arg Tyr Ser Glu Ala 
1820 1825 1830 

CAG CTG AGC AGC TTG GCA GGC CCA AAG GCT GGC TCC AAT ATG GTA AGG 5752 
Gin Leu Ser Ser Leu Ala Gly Pro Lys Ala Gly Ser Asn Met Val Arg 
1835 1840 1845 1850 

CTC CAG CGA ATT GAA GAC CTC CCC ACC GTG GCC ACC CTG GGA AAT TCC 5 8 00 

Leu Gin Arg lie Glu Asp Leu Pro Thr Val Ala Thr Leu Gly Asn Ser 
1855 i860 1865 

TTC TTC CAC AAG CTG TGC TCT GGG TTT GAT AGA GTT TGC GTG GAT GAG 5 84B 

Phe Phe His Lys Leu Cys Ser Gly Phe Asp Arg Val Cys Val Asp Glu 
1870 1875 i860 

GAT GGG AAT GAG AAG AGG CCC GGG GAT GTC TGG ACC TTG CCA GAC CAG 5 8 96 

Asp Gly Asn Glu Lys Arg Pro Gly Asp Val Trp Thr Leu Pro Asp Gin 
1885 1890 1695 

TGC CAC ACA GTG ACT TGC CTG CCA GAT GGC CAG ACC TTG CTG AAG AGT 5 94 4 

Cys His Thr Val Thr Cys Leu Pro Asp Gly Gin Thr Leu Leu Lys Ser 
1900 1905 1910 

CAT CGG GTC AAC TGT GAC CGG GGG CCA AGG CCT TCG TGC CCC AAT GGC 5992 
His Arg Val Asn Cys Asp Arg Gly Pro Arg Pro Ser Cys Pro Asn Gly 
1915 1920 1925 1930 

CAG CCC CCT CTC AGG GTA GAG GAG ACC TGT GGC TGC CGC TGG ACC TGT 6040 
Gin Pro Pro Leu Arg Val Glu Glu Thr Cys Gly Cys Arg Trp Thr Cys 
1935 1940 1945 

CCC TGT GTG TGC ATG GGC AGC TCT ACC CGG CAC ATC GTG ACC TTT GAT 6088 
Pro Cys Val Cys Met Gly Ser Ser Thr Arg His lie Val Thr Phe Asp 
1950 1955 i960 

GGG CAG AAT TTC AAG CTG ACT GGC AGC TGT TCG TAT GTC CTA TTT CAA 6136 
Gly Gin Asn Phe Lys Leu Thr Gly Ser Cys Ser Tyr Val Leu Phe Gin 
1965 1970 1975 

AAC AAG GAG CAG GAC CTG GAG GTG ATT CTC CAG AAT GGT GCC TGC AGC 6184 
Asn Lys Glu Gin Asp Leu Glu Val He Leu GLn Asn Gly Ala Cys Ser 
1980 X9B5 1990 

CCT GGG GCG AAG GAG ACC TGC ATG AAA TOC ATT GAG GTG AAG CAT GAC 6232 
Pro Gly Ala Lys Glu Thr Cys Met Lys Ser He Glu Val Lyc Hib Asp 
1995 2000 2005 2010 

GGC CTC TCA GTT GAG CTC CAC AGT GAC ATG CAG ATG ACA GTG AAT GOG 6280 
Gly Leu Ser Val Glu Leu His Ser Asp Met Gin Met Thr Val Asn Gly 
2015 2020 2025 

AGA CTA GTC TCC ATC CCA TAT GTG GGT GGA GAC ATG GAA GTC AAT GTT 632B 
Arg Leu Val Ser He Pro Tyr Val Gly Gly Asp Met Glu Val Asn Val 
2030 2035 2040 

TAT GGG ACC ATC ATC TAT r,AH crrr j&p TT*~ r,,v — ~ — — 



A^> C'J... CAA AAC AAT GAG TTC CAG CTT. PAG CTC ACC 
"he .hi Phe ;nr I ro Gin Asn Asn Glu Phe Gin Leu Gin Leu Ser Pro 
2060 2065 2070

AGG ACC TTT GCT TCG AAG ACA TAT GGT CTC TGT GGG ATC TGT GAT GAG 6472
Arg Thr Phe Ala Ser Lys Thr Tyr Gly Leu Cys Gly Ile Cys Asp Glu
2075 2080 2085 2090

AAC GGA GCC AAT GAC TTC ATT CTG AGG GAT GGG ACA GTC ACC ACA GAC 6 52 0 

Asn Gly Ala Asn Asp Phe He Leu Arg Asp Gly Thr Val Thr Thr Asp 
2095 2100 2105 

TGG AAG GCA CTC ATC CAG GAA TGG ACC GTA GAG CAG CTT GGG AAG ACA 6 56 8 

Trp Lys Ala Leu He Gin Glu Trp Thr Val Gin Gin Leu Gly Lys Thr 
2110 2115 2120 

TCC CAG CCT GTC CAT GAG GAG CAG TGT CCT GTC TCC GAA TTC TTC CAC 6 616 

Ser Gin Pro Val His Glu Glu Gin Cys Pro Val Ser Glu Phe Phe His 

2125 2130 2135 



TGC CAG GTC CTC CTC TCA GAA TTG TTT GCC GAG TGC CAC AAG GTC CTC 6664
Cys Gln Val Leu Leu Ser Glu Leu Phe Ala Glu Cys His Lys Val Leu
2140 2145 2150

GCT CCA GCC ACC TTT TAT GCC ATG TGC CAG CCC GAC AGT TGC CAC CCG 6712
Ala Pro Ala Thr Phe Tyr Ala Met Cys Gln Pro Asp Ser Cys His Pro
2155 2160 2165 2170

AAG AAA GTG TGT GAG GCG ATT GCC TTG TAT GCC CAC CTC TGT CGG ACC 6760
Lys Lys Val Cys Glu Ala Ile Ala Leu Tyr Ala His Leu Cys Arg Thr
2175 2180 2185

AAA GGG GTC TGT GTG GAC TGG AGG AGG GCC AAT TTC TGT GCT ATG TCA 6808
Lys Gly Val Cys Val Asp Trp Arg Arg Ala Asn Phe Cys Ala Met Ser
2190 2195 2200

TGT CCA CCA TCC CTG GTG TAC AAC CAC TGT GAG CAT GGC TGC CCT CGG 6856
Cys Pro Pro Ser Leu Val Tyr Asn His Cys Glu His Gly Cys Pro Arg
2205 2210 2215

CTC TGT GAA GGC AAT ACA AGC TCC TGT GGG GAC CAA CCC TCG GAA GGC 6 904 

Leu Cys Glu Gly Asn Thr Ser Ser Cys Gly Asp Gin Pro Ser Glu Gly 
2220 2225 2230 

TGC TTC TGC CCC CCA AAC CAA GTC ATG CTG GAA GGT AGC TGT GTC CCC 6952 
Cys Phe Cys Pro Pro Asn Gin Val Met Leu Glu Gly Ser Cys Val Pro 
2235 2240 2245 2250 

GAG GAG GCC TGT ACC CAG TGC ATC AGC GAG GAT GGA GTC CGG CAC CAG 7000 
Glu Glu Ala Cys Thr Gin Cys He Ser Glu Asp Gly Val Arg His Gin 
2255 2260 2265 

TTC CTG GAA ACC TOG GTC CCA GCC CAC CAG CCT TGC CAG ATC TGC ACG 7048 
Phe Leu Glu Thr Trp Val Pro Ala His Gin Pro Cys Gin He Cys Thr 
2270 2275 2280 

TGC CTC AGT GGG CGG AAG GTC AAC TGT ACG TTG CAG CCC TGC CCC ACA 7096 
Cys Leu Ser Gly Arg Lys Val Asn Cys Thr Leu Gin Pro Cys Pro Thr 
2285 2290 2295 

GCC AAA GCT CCC ACC TGT GGC CCG TGT GAA GTG GCC CGC CTC CGC CAG 7144 
Ala Lys Ala Pro Thr Cys Gly Pro Cys Glu Val Ala Arg Leu Arg Gin 
2300 2305 2310 

AAC GCA GTG CAG TGC TGC CCG GAG TAC GAG TGT GTG TGT GAC CTG GTG 7192 
Asn Ala Val Gin Cys Cys Pro Glu Tyr Glu Cys Val Cys Asp Leu Val 
2315 2320 2325 2330 

AGC TGT GAC CTG CCC CCG GTG CCT CCC TGC GAA GAT GGC CTC CAG ATG 7240 
Ser Cys Asp Leu Pro Pro Val Pro Pro Cys Glu Asp Gly Leu Gin Met 
2335 2340 2 345 






ACC CTG ACC AAT CCT GGC GAG TGC AGA CCC AAC TTC ACC TGT GCC TGC 7288
Thr Leu Thr Asn Pro Gly Glu Cys Arg Pro Asn Phe Thr Cys Ala Cys
2350 2355 2360

AGG AAG GAT GAA TGC AGA CGG GAG TCC CCG CCC TCT TGT CCC CCG CAC 7336
Arg Lys Asp Glu Cys Arg Arg Glu Ser Pro Pro Ser Cys Pro Pro His
2365 2370 2375

CGG ACG CCG GCC CTT CGG AAG ACT CAG TGC TGT GAT GAG TAT GAG TGT 7384
Arg Thr Pro Ala Leu Arg Lys Thr Gln Cys Cys Asp Glu Tyr Glu Cys
2380 2385 2390

GCA TGC AAC TGT GTC AAC TCC ACG GTG AGC TGC CCG CTT GGG TAC CTG 74 3 2 

Ala Cys Asn Cys Val Asn Ser Thr Val Ser Cys Pro Leu Gly Tyr Leu 
2395 2400 2405 2410 

GCC TCG GCT GTC ACC AAC GAC TGT GGC TGC ACC ACA ACA ACC TGC TTC 7 4 80 

Ala Ser Ala Val Thr Asn Asp Cys Gly Cys Thr Thr Thr Thr Cys Phe 
2415 2420 2425 

CCT GAC AAG GTG TGT GTC CAC CGA GGC ACC ATC TAC CCT GTG GGC CAG 7 52 8 

Pro Asp Lys Val Cys Val His Arg Gly Thr lie Tyr Pro Val Gly Gin 
2430 2435 2440 

TTC TGG GAG GAG GCC TGT GAC GTG TGC ACC TGC ACG GAC TTG GAG GAC 7 5 76 

Phe Trp Glu Glu Ala Cys Asp Val Cys Thr Cys Thr Asp Leu Glu Asp 
2445 2450 2455 

TCT GTG ATG GGC CTG CGT GTG GCC CAG TGC TCC CAG AAG CCC TGT GAG 76 2 4 

Ser Val Met Gly Leu Arg Val Ala Gin Cys Ser Gin Lys Pro Cys Glu 
2460 2465 2470 

GAC AAC TGC CTG TCA GGC TTC ACT TAT GTC CTT CAT GAA GGC GAG TGC 7672 
Asp Asn Cys Leu Ser Gly Phe Thr Tyr Val Leu His Glu Gly Glu Cys 
2475 2480 2485 2490 

TGT GGA AGG TGT CTG CCA TCT GCC TGT GAG GTG GTC ACT GGT TCA CCA 7720 
Cys Gly Arg Cys Leu Pro Ser Ala Cys Glu Val Val Thr Gly Ser Pro 
249S 2500 2505 

CGG GGC GAC GCC CAG TCT CAC TGG AAG AAT GTT GGC TCT CAC TGG GCC 7768 
Arg Gly Asp Ala Gin Ser His Trp Lys Asn Val Gly Ser His Trp Ala 
2S10 2515 2520 

TCC CCT GAC AAC CCC TGC CTC ATC AAT GAG TGT GTC CGA GTG AAG GAA 7816 
Ser Pro Asp Asn Pro Cys Leu He Asn Glu Cys Val Arg Val Lys Glu 
2525 2530 2535 

GAG GTC TTT GTG CAA CAG AGG AAT GTC TCC TGC CCC CAG CTG AAT GTC 7864 
Glu Val Phe Val Gin Gin Arg Asn Val Ser Cys Pro Gin Leu Asn Val 
2540 2545 2550 

CCC ACC TGC CCC ACG GGC TTC CAG CTG AGC TGT AAG ACC TCA GAG TGT 7912 
Pro Thr Cys Pro Thr Gly Phe Gin Leu Ser Cys Lya Thr Ser Glu Cys 
2555 2560 2565 2570 

TGT CCC ACC TGT CAC TGC GAG CCC CTG GAG GCC TGC TTG CTC AAT GGT 7960 
Cys Pro Thr Cys His Cys Glu Pro Leu Glu Ala Cys Leu Leu Asn Gly 
2575 2580 2565 

AC ATT AT^ GGC PC RAt rs- E — * — • . . 



AL\.. 

Thr 



2605 



JCLi «m , tr.,/\ CTT';.; ATC TCT GGA TTT' AAG CTG CA.., 
Pro Vdi Giy Vai He Ser Gly Phe Lys Leu Glu 
2610 2615

GGC AGG AAG ACC ACC TGT GAG GCA TGC CCC CTG GGT TAT AAG GAA GAG 8104

Gly Arg Lys Thr Thr Cys Glu Ala Cys Pro Leu Gly Tyr Lys Glu Glu 
2620 2625 2630 

AAG AAC CAA GGT GAA TGC TGT GGG AGA TGT CTG CCT ATA GCT TGC ACC 8152 

Lys Asn Gin Gly Glu Cys Cys Gly Arg Cys Leu Pro He Ala Cys Thr 
2635 2640 2645 2650 



ATT CAG CTA AGA GGA GGA CAG ATC ATG ACA CTG AAG CGT GAT GAG ACT 8200
Ile Gln Leu Arg Gly Gly Gln Ile Met Thr Leu Lys Arg Asp Glu Thr
2655 2660 2665

ATC CAG GAT GGC TGT GAC AGT CAC TTC TGC AAG GTC AAT GAA AGA GGA 8248
Ile Gln Asp Gly Cys Asp Ser His Phe Cys Lys Val Asn Glu Arg Gly
2670 2675 2680

GAG TAC ATC TGG GAG AAG AGA GTC ACG GGT TGC CCA CCT TTC GAT GAA 8296
Glu Tyr Ile Trp Glu Lys Arg Val Thr Gly Cys Pro Pro Phe Asp Glu
2685 2690 2695

CAC AAG TGT CTG GCT GAG GGA GGA AAA ATC ATG AAA ATT CCA GGC ACC 8344
His Lys Cys Leu Ala Glu Gly Gly Lys Ile Met Lys Ile Pro Gly Thr
2700 2705 2710

TGC TGT GAC ACA TGT GAG GAG CCA GAA TGC AAG GAT ATC ATT GCC AAG 8392
Cys Cys Asp Thr Cys Glu Glu Pro Glu Cys Lys Asp Ile Ile Ala Lys
2715 2720 2725 2730

CTG CAG CGT GTC AAA GTG GGA GAC TGT AAG TCT GAA GAG GAA GTG GAC 8440
Leu Gln Arg Val Lys Val Gly Asp Cys Lys Ser Glu Glu Glu Val Asp
2735 2740 2745

ATT CAT TAC TGT GAG GGT AAA TGT GCC AGC AAA GCC GTG TAC TCC ATC 8488
Ile His Tyr Cys Glu Gly Lys Cys Ala Ser Lys Ala Val Tyr Ser Ile
2750 2755 2760



CAC ATG GAG GAT GTG CAG GAC CAG TGC TCC TGC TGC TCG CCC ACC CAG 8536 
His Met Glu Asp Val Gin Asp Gin Cys Ser Cys Cys Ser Pro Thr Gin 
2765 2770 2775 

ACG GAG CCC ATG CAG GTG GCC CTG CGC TGC ACC AAT GGC TCC CTC ATC 8584 
Thr Glu Pro Met Gin Val Ala Leu Arg Cys Thr Asn Gly Ser Leu He 
2780 2785 2790 

TAC CAT GAG ATC CTC AAT GCC ATC GAA TGC AGG TGT TCC CCC AGG AAG 8632
Tyr His Glu Ile Leu Asn Ala Ile Glu Cys Arg Cys Ser Pro Arg Lys
2795 2800 2805 2810

TGC AGC AAG TGAGGCCACT GCCTOGATGC TACTGTCGCC TGCCTTACCC 8681
Cys Ser Lys

GACXTTCACTG GACTGGCCAG AGTGCTGCTC AGTCCTCCTC AGTCCTCCTC CTGCTCTGCT 8741

CTTOTGCTTC CTGATCCCAC AATAAAGGTC AATCTTTCAC CTTGAAAAAA AAAAAAAAAA 8801

A 8802
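
The right-margin nucleotide numbers in SEQ ID NO: 1 can be cross-checked against the residue numbers with simple arithmetic. The following minimal Python sketch is not part of the original disclosure. It assumes, from the listing above, that the ATG start codon begins at nucleotide 203 (180 bases on the first three lines plus 22 bases before the ATG on the fourth), and it uses standard genetic code entries for the first ten codons only. The names `CDS_START`, `residue_to_nt` and `translate` are illustrative.

```python
# Standard genetic code entries for the codons used below (subset only).
CODON_TABLE = {
    "ATG": "Met", "AGT": "Ser", "CCT": "Pro", "ACC": "Thr",
    "AGA": "Arg", "CTT": "Leu", "GTG": "Val", "AGG": "Arg",
    "CTG": "Leu",
}

# Assumption read off the listing: first base of the ATG codon.
CDS_START = 203


def residue_to_nt(residue):
    """Return (first, last) nucleotide positions of a 1-based residue's codon."""
    first = CDS_START + 3 * (residue - 1)
    return first, first + 2


def translate(cds):
    """Translate a coding sequence written as codons (spaces are ignored)."""
    bases = cds.replace(" ", "").upper()
    return [CODON_TABLE[bases[i:i + 3]] for i in range(0, len(bases) - 2, 3)]


first_row = "ATG AGT CCT ACC AGA CTT GTG AGG GTG CTG"
print(translate(first_row))   # residues 1 to 10
print(residue_to_nt(10))      # codon of Leu 10; last base matches margin number 232
print(residue_to_nt(2813))    # codon of the final Lys, just before the TGA stop
```

Under this assumption the last base of residue r falls at nucleotide 202 + 3r, which reproduces the margin values 232 (residue 10), 280 (residue 26) and 8632 (residue 2810).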



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2813 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ser Pro Thr Arg Leu Val Arg Val Leu Leu Ala Leu Ala Leu Ile
1 5 10 15

Leu Pro Gly Lys Leu Cys Thr Lys Gly Thr Val Gly Arg Ser Ser Met 
20 25 30 

Ala Arg Cys Ser Leu Leu Gly Gly Asp Phe lie Asn Thr Phe Asp Glu 
35 40 45 

Ser Met Tyr Ser Phe Ala Gly Asp Cys Ser Tyr Leu Leu Ala Gly Asp 

50 55 60 

Cys Gin Glu His Ser He Ser Leu He Gly Gly Phe Gin Asn Asp Lys 
65 70 75 80 

Arg Val Ser Leu Ser Val Tyr Leu Gly Glu Phe Phe Asp lie His Leu 
85 90 95 

Phe Val Asn Gly Thr Met Leu Gin Gly Thr Gin Ser He Ser Met Pro 
100 105 110 

Tyr Ala Ser Asn Gly Leu Tyr Leu Glu Ala Glu Ala Gly Tyr Tyr Lys 
115 120 125 

Leu Ser Ser Glu Ala Tyr Gly Phe Val Ala Arg He Asp Gly Asn Gly 
130 ' 135 140 

Asn Phe Gin Val Leu Leu Ser Asp Arg Tyr Phe Asn Lys Thr Cys Gly 
145 150 155 160 

Leu Cys Gly Asn Phe Asn He Phe Ala Glu Asp Asp Phe Lys Thr Gin 
165 170 175 

Glu Gly Thr Leu Thr Ser Asp Pro Tyr Asp Phe Ala Asn Ser Trp Ala 
180 185 190 

Leu Ser Ser Gly Glu Gin Arg Cys Lys Arg Val Ser Pro Pro Ser Ser 
195 200 205 

Pro Cys Asn Val Ser Ser Asp Glu Val Gin Gin Val Leu Trp Glu Gin 
210 215 220 

Cys Gin Leu Leu Lys Ser Ala Ser Val Phe Ala Arg Cys His Pro Leu 
225 230 235 240 

Val Asp Pro Glu Pro Phe Val Ala Leu Cys Glu Arg Thr Leu Cys Thr 
245 250 255 

Cys Val Gin Gly Met Glu Cys Pro Cys Ala Val Leu Leu Glu Tyr Ala 
260 265 270 

Arg Ala Cys Ala Gin Gin Gly lie Val Leu Tyr Gly Trp Thr Asp His 
275 280 285 

Ser Val Cys Arg Pro Ala Cys Pro Ala Gly Met Glu Tyr Lys Glu Cys 
290 295 300 

Val Ser Pro Cys Thr Arg Thr Cys Gln Ser Leu His Val Lys Glu Val
305 310 315 320

Cys Gln Glu Gln Cys Val Asp Gly Cys Ser Cys Pro Glu Gly Gln Leu
325 330 335

Leu Asp Glu Gly His Cys Val Gly Ser Ala Glu Cys Ser Cys Val His
340 345 350

Ala Gly Gln Arg Tyr Pro Pro Gly Ala Ser Leu Leu Gln Asp Cys His
355 360 365

Thr Cys He Cys Arg Asn Ser Leu Trp He Cys Ser Asn Glu Glu Cys 
370 37S 380 

Pro Gly Glu Cys Leu Val Thr Gly Gin Ser His Phe Lys Ser Phe Asp 
365 390 39S 400 

Asn Arg Tyr Phe Thr Phe Ser Gly Val Cys His Tyr Leu Leu Ala Gin 
405 410 415 

Asp Cys Gin Asp His Thr Phe Ser Val Val He Glu Thr Val Gin Cys 
420 425 430 

Ala Asp Asp Leu Asp Ala Val Cys Thr Arg Ser Val Thr Val Arg Leu 
435 440 445 

Pro Gly His His Asn Ser Leu Val Lys Leu Lys Asn Gly Gly Gly Val 
450 455 460 

Ser Met Asp Gly Gin Asp He Gin He Pro Leu Leu Gin Gly Asp Leu 
465 470 475 480 

Arg He Gin His Thr Val Met Ala Ser Val Arg Leu Ser Tyr Gly Glu 
485 490 495 

Asp Leu Gin Met Asp Ser Asp Val Arg Gly Arg Leu Leu Val Thr Leu 
S0O 505 510 

Tyr Pro Ala Tyr Ala Gly Lys Thr Cys Gly Arg Gly Gly Asn Tyr Asn 
515 520 525 

Gly Asn Arg Gly Asp Asp Phe Val Thr Pro Ala Gly Leu Ala Glu Pro 
530 535 540 

Leu Val Glu Asp Phe Gly Asn Ala Trp Lys Leu Leu Gly Ala Cys Glu 
545 550 555 560 

Asn Leu Gin Lys Gin His Arg Asp Pro Cys Ser Leu Asn Pro Arg Gin 
565 570 575 

Ala Arg Phe Ala Glu Glu Ala Cys Ala Leu Leu Thr Ser Ser Lys Phe 
580 585 590 

Glu Pro Cys His Arg Ala Val Gly Pro Gin Pro Tyr Val Gin Asn Cys 
595 600 605 

Leu Tyr Asp Val Cys Ser Cys Ser Asp Gly Arg Asp Cys Leu Cys Ser 
610 615 620 

Ala Val Ala Asn Tyr Ala Ala Ala Val Ala Arg Arg Gly Val His He 
625 630 635 640 

Ala Trp Arg Glu Pro Gly Phe Cys Ala Leu Ser Cys Pro Gin Gly Gin 
645 650 655 

Val Tyr Leu Gin Cys Gly Thr Pro Cys Asn Met Thr Cys Leu Ser Leu 
660 665 670 

Ser Tyr Pro Glu Glu Asp Cys Asn Glu Val Cys Leu Glu Ser Cys Phe 
675 680 685 

Ser Pro Pro Gly Leu Tyr Leu Asp Glu Arg Gly Asp Cys Val Pro Lys 
690 695 700

Ala Gln Cys Pro Cys Tyr Tyr Asp Gly Glu Ile Phe Gln Pro Glu Asp
705 710 715 720

He Phe Ser Asp His His Thr Met Cys Tyr Cys Glu Asp Gly Phe Met 
725 730 735 

His Cys Thr Thr Ser Gly Gly Leu Gly Ser Leu Leu Pro Asn Pro Val 
740 745 750 

Leu Ser Ser Pro Arg Cys His Arg Ser Lys Arg Ser Leu Ser Cys Arg 
755 760 765 

Pro Pro Met Val Lys Leu Val Cys Pro Ala Asp Asn Pro Arg Ala Glu 
770 775 780 

Gly Leu Glu Cys Ala Lys Thr Cys Gin Asn Tyr Asp Leu Gin Cys Met 
785 790 795 800 

Ser Thr Gly Cys Val Ser Gly Cys Leu Cvs Pro Gin Gly Met Val Arn 
805 810 * 815 

His Glu Asn Arg Cys Val Ala Leu Glu Arg Cys Pro Cys Phe His Gin 
820 825 830 

Gly Gin Glu Tyr Ala Pro Gly Glu Thr Val Lys He Asp Cys Asn Thr 
835 840 845 

Cys Val Cys Arg Asp Arg Lys Trp Thr Cys Thr Asp His Val Cys Asp 
850 855 860 

Ala Thr Cys Ser Ala Ile Gly Met Ala His Tyr Leu Thr Phe Asp Gly
865 870 875 880

Leu Lys Tyr Leu Phe Pro Gly Glu Cys Gln Tyr Val Leu Val Gln Asp
885 890 895

Tyr Cys Gly Ser Asn Pro Gly Thr Leu Arg Ile Leu Val Gly Asn Glu
900 905 910

Gly Cys Ser Tyr Pro Ser Val Lys Cys Lys Lys Arg Val Thr Ile Leu
915 920 925

Val Glu Gly Gly Glu Ile Glu Leu Phe Asp Gly Glu Val Asn Val Lys
930 935 940

Lys Pro Met Lys Asp Glu Thr His Phe Glu Val Val Glu Ser Gly Gln
945 950 955 960

Tyr Val Ile Leu Leu Leu Gly Lys Ala Leu Ser Val Val Trp Asp His
965 970 975

Arg Leu Ser Ile Ser Val Thr Leu Lys Arg Thr Tyr Gln Glu Gln Val
980 985 990

Cys Gly Leu Cys Gly Asn Phe Asp Gly Ile Gln Asn Asn Asp Phe Thr
995 1000 1005

Ser Ser Ser Leu Gln Ile Glu Glu Asp Pro Val Asp Phe Gly Asn Ser
1010 1015 1020

Asp Ser Ser Cys Arg Ile Leu Thr Ser Asp Ile Phe Gln Asp Cys Asn
1060 1065 1070

Arg Leu Val Asp Pro Glu Pro Phe Leu Asp Ile Cys Ile Tyr Asp Thr
1075 1080 1085

Cys Ser Cys Glu Ser Ile Gly Asp Cys Thr Cys Phe Cys Asp Thr Ile
1090 1095 1100

Ala Ala Tyr Ala His Val Cys Ala Gln His Gly Lys Val Val Ala Trp
1105 1110 1115 1120

Arg Thr Ala Thr Phe Cys Pro Gln Asn Cys Glu Glu Arg Asn Leu His
1125 1130 1135

Glu Asn Gly Tyr Glu Cys Glu Trp Arg Tyr Asn Ser Cys Ala Pro Ala 
1140 1145 1150 

Cys Pro Ile Thr Cys Gln His Pro Glu Pro Leu Ala Cys Pro Val Gln
1155 1160 1165

Cys Val Glu Gly Cys His Ala His Cys Pro Pro Gly Lys Ile Leu Asp
1170 1175 1180

Glu Leu Leu Gln Thr Cys Ile Asp Pro Glu Asp Cys Pro Val Cys Glu
1185 1190 1195 1200

Val Ala Gly Arg Arg Leu Ala Pro Gly Lys Lys Ile Ile Leu Asn Pro
1205 1210 1215

Ser Asp Pro Glu His Cys Gln Ile Cys Asn Cys Asp Gly Val Asn Phe
1220 1225 1230 

Thr Cys Lys Ala Cys Arg Glu Pro Gly Ser Val Val Val Pro Pro Thr 
1235 1240 1245 

Asp Gly Pro Ile Gly Ser Thr Thr Ser Tyr Val Glu Asp Thr Ser Glu
1250 1255 1260 

Pro Pro Leu His Asp Phe His Cys Ser Arg Leu Leu Asp Leu Val Phe 
1265 1270 1275 1280 

Leu Leu Asp Gly Ser Ser Lys Leu Ser Glu Asp Glu Phe Glu Val Leu 
1285 1290 1295 

Lys Val Phe Val Val Gly Met Met Glu His Leu His Ile Ser Gln Lys
1300 1305 1310

Arg Ile Arg Val Ala Val Val Glu Tyr His Asp Gly Ser His Ala Tyr
1315 1320 1325

Ile Glu Leu Lys Asp Arg Lys Arg Pro Ser Glu Leu Arg Arg Ile Thr
1330 1335 1340

Ser Gln Val Lys Tyr Ala Gly Ser Glu Val Ala Ser Thr Ser Glu Val
1345 1350 1355 1360

Leu Lys Tyr Thr Leu Phe Gln Ile Phe Gly Lys Ile Asp Arg Pro Glu
1365 1370 1375

Ala Ser Arg Ile Ala Leu Leu Leu Met Ala Ser Gln Glu Pro Ser Arg
1380 1385 1390

Leu Ala Arg Asn Leu Val Arg Tyr Val Gln Gly Leu Lys Lys Lys Lys
1395 1400 1405

Val Ile Val Ile Pro Val Gly Ile Gly Pro His Ala Ser Leu Lys Gln
1410 1415 1420

Ile His Leu Ile Glu Lys Gln Ala Pro Glu Asn Lys Ala Phe Val Phe
1425 1430 1435 1440

Ser Gly Val Asp Glu Leu Glu Gln Arg Arg Asp Glu Ile Ile Asn Tyr
1445 1450 1455

Leu Cys Asp Leu Ala Pro Glu Ala Pro Ala Pro Thr Gln His Pro Pro
1460 1465 1470

Met Ala Gln Val Thr Val Gly Ser Glu Leu Leu Gly Val Ser Ser Pro
1475 1480 1485

Gly Pro Lys Arg Asn Ser Met Val Leu Asp Val Val Phe Val Leu Glu 
1490 1495 1500 

Gly Ser Asp Lys Ile Gly Glu Ala Asn Phe Asn Lys Ser Arg Glu Phe
1505 1510 1515 1520

Met Glu Glu Val Ile Gln Arg Met Asp Val Gly Gln Asp Arg Ile His
1525 1530 1535

Val Thr Val Leu Gln Tyr Ser Tyr Met Val Thr Val Glu Tyr Thr Phe
1540 1545 1550

Ser Glu Ala Gln Ser Lys Gly Glu Val Leu Gln Gln Val Arg Asp Ile
1555 1560 1565

Arg Tyr Arg Gly Gly Asn Arg Thr Asn Thr Gly Leu Ala Leu Gln Tyr
1570 1575 1580

Leu Ser Glu His Ser Phe Ser Val Ser Gln Gly Asp Arg Glu Gln Val
1585 1590 1595 1600

Pro Asn Leu Val Tyr Met Val Thr Gly Asn Pro Ala Ser Asp Glu Ile
1605 1610 1615

Lys Arg Met Pro Gly Asp Ile Gln Val Val Pro Ile Gly Val Gly Pro
1620 1625 1630

His Ala Asn Val Gln Glu Leu Glu Lys Ile Gly Trp Pro Asn Ala Pro
1635 1640 1645

Ile Leu Ile His Asp Phe Glu Met Leu Pro Arg Glu Ala Pro Asp Leu
1650 1655 1660

Val Leu Gln Arg Cys Cys Ser Gly Glu Gly Leu Gln Ile Pro Thr Leu
1665 1670 1675 1680

Ser Pro Thr Pro Asp Cys Ser Gln Pro Leu Asp Val Val Leu Leu Leu
1685 1690 1695

Asp Gly Ser Ser Ser Ile Pro Ala Ser Tyr Phe Asp Glu Met Lys Ser
1700 1705 1710

Phe Thr Lys Ala Phe Ile Ser Arg Ala Asn Ile Gly Pro Arg Leu Thr
1715 1720 1725 



Gln Val Ser Val Leu Gln Tyr Gly P^r T 1 T*- - *r*h- T "! n rw

Met Gln Gln Glu Gly Gly Pro Ser Glu Ile Gly Asp Ala Leu Ser Phe
1765 1770 1775

Ala Val Arg Tyr Val Thr Ser Glu Val His Gly Ala Arg Pro Gly Ala
1780 1785 1790

Ser Lys Ala Val Val Ile Leu Val Thr Asp Val Ser Val Asp Ser Val
1795 1800 1805

Asp Ala Ala Ala Glu Ala Ala Arg Ser Asn Arg Val Thr Val Phe Pro
1810 1815 1820

Ile Gly Ile Gly Asp Arg Tyr Ser Glu Ala Gln Leu Ser Ser Leu Ala
1825 1830 1835 1840

Gly Pro Lys Ala Gly Ser Asn Met Val Arg Leu Gln Arg Ile Glu Asp
1845 1850 1855 

Leu Pro Thr Val Ala Thr Leu Gly Asn Ser Phe Phe His Lys Leu Cys 
1860 1865 1870 

Ser Gly Phe Asp Arg Val Cys Val Asp Glu Asp Gly Asn Glu Lys Arg 
1875 1880 1885 

Pro Gly Asp Val Trp Thr Leu Pro Asp Gln Cys His Thr Val Thr Cys
1890 1895 1900

Leu Pro Asp Gly Gln Thr Leu Leu Lys Ser His Arg Val Asn Cys Asp
1905 1910 1915 1920

Arg Gly Pro Arg Pro Ser Cys Pro Asn Gly Gln Pro Pro Leu Arg Val
1925 1930 1935

Glu Glu Thr Cys Gly Cys Arg Trp Thr Cys Pro Cys Val Cys Met Gly
1940 1945 1950

Ser Ser Thr Arg His Ile Val Thr Phe Asp Gly Gln Asn Phe Lys Leu
1955 1960 1965

Thr Gly Ser Cys Ser Tyr Val Leu Phe Gln Asn Lys Glu Gln Asp Leu
1970 1975 1980

Glu Val Ile Leu Gln Asn Gly Ala Cys Ser Pro Gly Ala Lys Glu Thr
1985 1990 1995 2000

Cys Met Lys Ser Ile Glu Val Lys His Asp Gly Leu Ser Val Glu Leu
2005 2010 2015

His Ser Asp Met Gln Met Thr Val Asn Gly Arg Leu Val Ser Ile Pro
2020 2025 2030

Tyr Val Gly Gly Asp Met Glu Val Asn Val Tyr Gly Thr Ile Met Tyr
2035 2040 2045

Glu Val Arg Phe Asn His Leu Gly His Ile Phe Thr Phe Thr Pro Gln
2050 2055 2060

Asn Asn Glu Phe Gln Leu Gln Leu Ser Pro Arg Thr Phe Ala Ser Lys
2065 2070 2075 2080

Thr Tyr Gly Leu Cys Gly Ile Cys Asp Glu Asn Gly Ala Asn Asp Phe
2085 2090 2095

Ile Leu Arg Asp Gly Thr Val Thr Thr Asp Trp Lys Ala Leu Ile Gln
2100 2105 2110

Glu Trp Thr Val Gln Gln Leu Gly Lys Thr Ser Gln Pro Val His Glu
2115 2120 2125

Glu Gln Cys Pro Val Ser Glu Phe Phe His Cys Gln Val Leu Leu Ser
2130 2135 2140

Glu Leu Phe Ala Glu Cys His Lys Val Leu Ala Pro Ala Thr Phe Tyr
2145 2150 2155 2160

Ala Met Cys Gln Pro Asp Ser Cys His Pro Lys Lys Val Cys Glu Ala
2165 2170 2175

Ile Ala Leu Tyr Ala His Leu Cys Arg Thr Lys Gly Val Cys Val Asp
2180 2185 2190

Trp Arg Arg Ala Asn Phe Cys Ala Met Ser Cys Pro Pro Ser Leu Val 
2195 2200 2205 

Tyr Asn His Cys Glu His Gly Cys Pro Arg Leu Cys Glu Gly Asn Thr 
2210 2215 2220

Ser Ser Cys Gly Asp Gln Pro Ser Glu Gly Cys Phe Cys Pro Pro Asn
2225 2230 2235 2240

Gln Val Met Leu Glu Gly Ser Cys Val Pro Glu Glu Ala Cys Thr Gln
2245 2250 2255

Cys Ile Ser Glu Asp Gly Val Arg His Gln Phe Leu Glu Thr Trp Val
2260 2265 2270

Pro Ala His Gln Pro Cys Gln Ile Cys Thr Cys Leu Ser Gly Arg Lys
2275 2280 2285

Val Asn Cys Thr Leu Gln Pro Cys Pro Thr Ala Lys Ala Pro Thr Cys
2290 2295 2300

Gly Pro Cys Glu Val Ala Arg Leu Arg Gln Asn Ala Val Gln Cys Cys
2305 2310 2315 2320 

Pro Glu Tyr Glu Cys Val Cys Asp Leu Val Ser Cys Asp Leu Pro Pro 
2325 2330 2335 

Val Pro Pro Cys Glu Asp Gly Leu Gln Met Thr Leu Thr Asn Pro Gly
2340 2345 2350 

Glu Cys Arg Pro Asn Phe Thr Cys Ala Cys Arg Lys Asp Glu Cys Arg 
2355 2360 2365

Arg Glu Ser Pro Pro Ser Cys Pro Pro His Arg Thr Pro Ala Leu Arg
2370 2375 2380

Lys Thr Gln Cys Cys Asp Glu Tyr Glu Cys Ala Cys Asn Cys Val Asn
2385 2390 2395 2400 

Ser Thr Val Ser Cys Pro Leu Gly Tyr Leu Ala Ser Ala Val Thr Asn 
2405 2410 2415 

Asp Cys Gly Cys Thr Thr Thr Thr Cys Phe Pro Asp Lys Val Cys Val 
2420 2425 2430 

His Arg Gly Thr Ile Tyr Pro Val ^ v n 1 - ^1- ^ r 1

Val Ala Gln Cys Ser Gln Lys Pro Cys Glu Asp Asn Cys Leu Ser Gly
2465 2470 2475 2480

Phe Thr Tyr Val Leu His Glu Gly Glu Cys Cys Gly Arg Cys Leu Pro
2485 2490 2495

Ser Ala Cys Glu Val Val Thr Gly Ser Pro Arg Gly Asp Ala Gln Ser
2500 2505 2510 

His Trp Lys Asn Val Gly Ser His Trp Ala Ser Pro Asp Asn Pro Cys 
2515 2520 2525 

Leu Ile Asn Glu Cys Val Arg Val Lys Glu Glu Val Phe Val Gln Gln
2530 2535 2540

Arg Asn Val Ser Cys Pro Gln Leu Asn Val Pro Thr Cys Pro Thr Gly
2545 2550 2555 2560

Phe Gln Leu Ser Cys Lys Thr Ser Glu Cys Cys Pro Thr Cys His Cys
2565 2570 2575

Glu Pro Leu Glu Ala Cys Leu Leu Asn Gly Thr Ile Ile Gly Pro Gly
2580 2585 2590

Lys Ser Leu Met Ile Asp Val Cys Thr Thr Cys Arg Cys Thr Val Pro
2595 2600 2605

Val Gly Val Ile Ser Gly Phe Lys Leu Glu Gly Arg Lys Thr Thr Cys
2610 2615 2620

Glu Ala Cys Pro Leu Gly Tyr Lys Glu Glu Lys Asn Gln Gly Glu Cys
2625 2630 2635 2640

Cys Gly Arg Cys Leu Pro Ile Ala Cys Thr Ile Gln Leu Arg Gly Gly
2645 2650 2655

Gln Ile Met Thr Leu Lys Arg Asp Glu Thr Ile Gln Asp Gly Cys Asp
2660 2665 2670

Ser His Phe Cys Lys Val Asn Glu Arg Gly Glu Tyr Ile Trp Glu Lys
2675 2680 2685 

Arg Val Thr Gly Cys Pro Pro Phe Asp Glu His Lys Cys Leu Ala Glu 
2690 2695 2700 

Gly Gly Lys Ile Met Lys Ile Pro Gly Thr Cys Cys Asp Thr Cys Glu
2705 2710 2715 2720

Glu Pro Glu Cys Lys Asp Ile Ile Ala Lys Leu Gln Arg Val Lys Val
2725 2730 2735

Gly Asp Cys Lys Ser Glu Glu Glu Val Asp Ile His Tyr Cys Glu Gly
2740 2745 2750

Lys Cys Ala Ser Lys Ala Val Tyr Ser Ile His Met Glu Asp Val Gln
2755 2760 2765

Asp Gln Cys Ser Cys Cys Ser Pro Thr Gln Thr Glu Pro Met Gln Val
2770 2775 2780

Ala Leu Arg Cys Thr Asn Gly Ser Leu Ile Tyr His Glu Ile Leu Asn
2785 2790 2795 2800

Ala Ile Glu Cys Arg Cys Ser Pro Arg Lys Cys Ser Lys
2805 2810

WE CLAIM:

1. An isolated nucleic acid comprising a nucleotide sequence encoding
canine von Willebrand Factor polypeptide.

2. The isolated nucleic acid of Claim 1, wherein the nucleotide sequence
is capable of hybridizing under high stringency conditions to SEQ ID NO. 1.

3. The isolated nucleic acid of Claim 1, wherein the nucleotide sequence
encodes the Scottish terrier von Willebrand Factor polypeptide.

4. The isolated nucleic acid of Claim 2, wherein the nucleotide sequence
encodes the Scottish terrier von Willebrand Factor polypeptide.

5. A vector comprising the nucleic acid of Claim 1.

6. A vector comprising the nucleic acid of Claim 2.

7. A cell comprising the vector of Claim 5.

8. A cell comprising the vector of Claim 6.

9. An isolated nucleic acid comprising a nucleotide sequence encoding
defective canine von Willebrand Factor polypeptide.

10. The isolated nucleic acid of Claim 9, wherein the nucleotide sequence
is capable of hybridizing under high stringency conditions to the complement of SEQ
ID NO. 1 having a base deletion at codon 88.

11. A vector comprising the nucleic acid of Claim 9.

12. A vector comprising the nucleic acid of Claim 10.

13. A cell comprising the vector of Claim 11.

14. A cell comprising the vector of Claim 12.

15. An isolated oligonucleotide sequence consisting of contiguous nucleic
acids of the nucleotide sequence of SEQ ID NO. 1 and capable of specifically
hybridizing with the canine von Willebrand Factor gene.

16. An isolated oligonucleotide sequence consisting of contiguous nucleic
acids of the nucleotide sequence that is complementary to the sequence of SEQ ID
NO. 1 and capable of specifically hybridizing with the canine von Willebrand Factor
gene.



17. A method of detecting a canine von Willebrand Factor gene in a
sample comprising the steps of:

a) contacting the sample with an oligonucleotide comprising
contiguous nucleic acids of the nucleotide sequence of SEQ
ID NO. 1 and capable of specifically hybridizing with the
canine von Willebrand Factor gene, under conditions favorable
for hybridization of the oligonucleotide to any complementary
sequences of nucleic acid in the sample; and

b) detecting hybridization, thereby detecting a canine von
Willebrand Factor gene.



18. The method of Claim 17, further comprising the step of:

c) quantifying hybridization of the oligonucleotide to
complementary sequence.

19. The method of Claim 17, wherein in SEQ ID NO. 1 there is a base
deletion at codon 88.



20. An assay kit for screening for a canine von Willebrand Factor gene
comprising:

a) an oligonucleotide comprising contiguous nucleic acids of the
nucleotide sequence of SEQ ID NO. 1 and capable of
hybridizing with the canine von Willebrand Factor gene;

b) reagents for hybridization of the oligonucleotide to a
complementary nucleic acid sequence; and

c) container means for a)-b).

21. A method of detecting a canine von Willebrand Factor gene in a
sample comprising the steps of:

a) contacting the sample with an oligonucleotide comprising
contiguous nucleic acids of the nucleotide sequence that is
complementary to the sequence of SEQ ID NO. 1 and capable
of specifically hybridizing to the complementary nucleotide
sequence, under conditions favorable for hybridization of the
oligonucleotide to any complementary sequences of nucleic
acid in the sample; and

b) detecting hybridization, thereby detecting a canine von
Willebrand Factor gene.

22. The method of Claim 21, further comprising the step of:

c) quantifying hybridization of the oligonucleotide to
complementary sequences.

23. The method of Claim 21, wherein in SEQ ID NO. 1 there is a base
deletion at codon 88.

24. An assay kit for screening for a canine von Willebrand Factor gene
comprising:

a) an oligonucleotide comprising contiguous acids from the
nucleotide sequence that is complementary to the sequence
of SEQ ID NO. 1 and capable of specifically hybridizing to the
complementary nucleotide sequence;

b) reagents for hybridization of the oligonucleotide to a
complementary nucleic acid sequence; and

c) container means for a)-b).

25. The assay kit of Claim 24, wherein in SEQ ID NO. 1 there is a base
deletion at codon 88.

26. A method for detecting a mutated canine von Willebrand Factor gene
in a canine DNA sample comprising the steps of:

a) amplifying the DNA sample by polymerase chain reaction to
produce polymerase chain reaction products, wherein the
polymerase chain reaction uses primers that produce a
restriction site in a mutant allele but not in a normal allele;

b) digesting the polymerase chain reaction products with a
restriction enzyme specific to the restriction site of the
restriction site primer to produce DNA fragments; and

c) detecting the DNA fragments, thereby detecting a mutated
canine von Willebrand Factor gene.

27. The method of Claim 26, wherein the primers are those of Figure 4.

28. The method of Claim 26, wherein the DNA fragments are detected by
gel electrophoresis.

29. The method of Claim 27, wherein the restriction enzyme is BsiE I.

30. The method of Claim 27, wherein the restriction enzyme is Sau96 I.

31. An oligonucleotide probe capable of detecting a mutation associated
with canine von Willebrand's disease, wherein the mutation is a base deletion at
codon 88 of the canine von Willebrand Factor gene.
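
The digestion and fragment-detection steps recited in Claim 26 can be illustrated with a short computational sketch. It is offered only as an illustration under stated assumptions and is not part of the claimed method: the two amplicon strings are invented toy sequences (not the primers of Figure 4 and not taken from SEQ ID NO. 1), the function names are hypothetical, and the recognition pattern GGNCC with a cut after the first base is used as an example site (the pattern commonly listed for Sau96 I). In the claimed method the restriction site is produced by the primers during amplification; the toy data merely model the outcome, in which the mutant product carries the site and the normal product does not.

```python
import re

# Subset of IUPAC nucleotide codes needed for simple recognition patterns.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]"}


def find_sites(seq, pattern):
    """Return 0-based start positions of every match of an IUPAC pattern."""
    body = "".join(IUPAC[base] for base in pattern)
    # A lookahead lets overlapping occurrences be reported as well.
    return [m.start() for m in re.finditer("(?=" + body + ")", seq)]


def digest(seq, pattern, cut_offset):
    """Return fragment lengths after cutting cut_offset bases into each site."""
    cuts = [start + cut_offset for start in find_sites(seq, pattern)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical toy amplicons: the mutant differs by one deleted base,
# which here happens to create a GGNCC site absent from the normal product.
normal_amplicon = "AAAAGGTACCTTTTTTTT"
mutant_amplicon = "AAAAGGACCTTTTTTTT"

normal_fragments = digest(normal_amplicon, "GGNCC", 1)
mutant_fragments = digest(mutant_amplicon, "GGNCC", 1)
print("normal:", normal_fragments)  # a single uncut fragment
print("mutant:", mutant_fragments)  # two fragments
```

In this toy case the normal product stays as one fragment while the mutant product is cut in two, which is the size difference that gel electrophoresis (Claim 28) would reveal. A heterozygous sample would be expected to show both patterns.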
FIGURE 1A



1 CATTAANAGG TCCTGGCTGG GAGCTTTTTT TTGGGACCAG CACTCCATGT TCAAGGGCAA 
61 ACAGGGGCCA ATTAGGATCA ATCTTTTTTC TTTCTTTTTT TAAAAAAAAA AATTCTTCCC 
121 ACTTTGCACA CGGACAGTAG TACATACCAC TAGCTCTCTG CGAGGACGGT GATCACTAAT 
181 CATTTCTCCT GCTTCGTGGC AGATGACTCC TACCAGACTT GTGAGGGTGC TGCTGGCTCT 
241 GGCCCTCATC TTGCCAGGGA AACTTTGTAC AAAAGGGACT GTTGGAAGGT CATCGATGGC 
301 CCGATGTAGC CTTCTCGGAG GTGACTTCAT CAACACCTTT GATGAGAGCA TGTACAGCTT 
361 TGCGGGAGAT TGCAGTTACC TCCT G GCTGG GGACTGCCAG GAACACTCCA TCTCACTTAT 
421 CGGGGGTTTC CAAAATGACA AAAGAGTGAG CCTCTCCGTG TATCTCGGAG AATTTTTCGA 
481 CATTCATTTG TTTGTCAATG GTACCATGCT GCAGGGGACC CAAAGCATCT CCATGCCCTA
541 CGCCTCCAAT GGGCTGTATC TAGAGGCCGA GGCTGGCTAC TACAAGCTCT CCAGTGAGGC 
601 CTACGGCTTT GTGGCCAGAA TTGATGGCAA TGGCAACTTT CAAGTCCTGC TGTCAGACAG 
661 ATACTTCAAC AAGACCTGTG GGCTGTCTGG CAACTTTAAT ATCTTTGCTG AGGATGACTT 
721 CAAGACTCAA GAAGGGACGT TGACTTCGGA CCCCTATGAC TTTGCCAACT CCTGGGCCCT 
781 GAGCAGTCGG GAA CA& CGCT GGAAACGGGT OTCCCCTCCC AGCAGCCCAT GGAA2 Vi'Cl C
841 CTCTGATGAA GTGCAGCAGC TCCTGTCGGA GCAGTGCCAG CTCCTGAAGA GTGCCTCGGT 
901 GTTTGCCCGC TGCCACCCGC TGGTGGACCC TGAGCCTTTT GTCGCCCTGT CTGAAAGGAC 
961 TCTGTGCACC TGTGTCCAGG GGATGGACTG CCCTTGTGCG GTCCTCCTGG AGTACGCCCG 
1021 GGCCTGTGCC CAGCAGGGGA TTGTCTTGTA CGGCTGGACC GACCACAGCG TCTGCCGACC 
1081 AGCATGCCCT GCTGGCATGG AGTACAAGGA GTGCGTGTCC CCTTGCACCA GAACTTGCCA
1141 GAGCCTTCAT GTCAAAGAAG TGTGTCAGGA GCAATCTGTA GATCGCTGCA GCTGCCCCGA 
1201 GGGCCAGCTC CTGGATGAAG GCCACTGCCT GGGAAGTGCT GAGTGTTCCT GTGTGCATGC 
1261 TGGGCAACGG TACCCTCCGG GCGCCTCCCT CTTACAGGAC TGCCACACCT GCATTTGCCG 
1321 AAATAGCCTG TGGATCTGCA GCAATGAAGA ATCCCCAGGC GAGTGTCTGG TCACAGGACA 
1381 GTCCCACTTC AAGACCTTCG ACAACACGTA CTTCACCTTC AGTGGGGTCT GCCACTACCT 
1441 GCTGGCCCAG GACTGCCAGG ACCACACATT CTCTGTTGTC ATAGAGACTG TCCAGTGTGC
1501 CGATGACCTG GATGCTGTCT GCACCCGCTC GGTCACCGTC CGCCTGCCTG GACATCACAA 
1561 CAGCCTTGTG AAGCTGAAGA ATGGGGGAGC AGTCTCCATG GATGGCCAGG ATATCCAGAT 
1621 TCCTCTCCTG CAAGGTGACC TCCGCATCCA GCACACCGTG ATGGCCTCCG TGCGCCTCAG 
1681 CTACGGGGAG GACCTGCAGA TGGATTCGGA CGTCCGGGGC AGGCTACTGG TGACGCTGTA 
1741 CCCCGCCTAC GCGGGGAAGA CCT G CGCCCG TGGCGGGAAC TACAACGGCA ACCGGGGGGA 
1801 CGACTTCGTG ACGCCCGCAG GCCTGGCGGA GCCCCTGGTG GAGGACTTCG GGAACGCCTG
1861 GAAGCTGCTC GGGGCCTGCG AGAACCTGCA GAAGCAGCAC CGCGATCCCT GCAGCCTCAA 
1921 CCCGCCCCAG GCCAGGTTTG CGGAGGAGGC GTGCGCGCTG CTGACCTCCT CGAAGTTCGA 
1981 GCCCTGCCAC CGAGCGGTGG CTCCTCAGCC CTACGTGCAG AACTGCCTCT ACGACGTCTG 
2041 CTCCTGCTCC GACGGCAGAG ACTGTCTTTG CAGCGCCGTC CCCAACTACG CCGCAGCCGT 
2101 GGCCCGGAGG GGCOTGCACA TCGCGTGGCG GGAGCCGGGC TTCTGTGCGC TGAGCTGCCC 
2161 CCAGGGCCAG GTGTACCTGC AGTCTOGCAC CCCTTCCAAC ATGACCTGTC TCTCCCTCTC 
2221 TTACCOGGAG GAGGACTGCA AXGAGOTCTG CTTGGAAAGC lUC TAVl UX CCCCAGGGCT 
2281 CTACCTGGAT GAGAGGGGAG ATTGTGTGCC CAAGGCTCAG TGTCCCT G TT ACTATGATGG
2341 TGAGATCTTT CAGCCCGAAG ACATCTTCTC AGACCATCAC ACCATGTGCT ACTGTGAGGA 
2401 TOOCTTCATG CACTGTACCA CAAGTGGAOG CCTGGGXAOC C1 GC T G CC CA ACCCGGTOCT 
2461 CAGCAGCCCC CGGTGTCACC GCAGCAAAAC GAGCCTGTCC TGTOGGCCCC CCATGGTCAA 
2521 GTTGCTGTOT CCCGCTGATX ACCOOAOGOC TGAAGGACTO GACTOTGCCA AAACCTOCCA 
2581 GAACTATGAC CTGCAGTGCA TGAGCACAOG CTO TOT C T C C CGCTOCCTCT OCCCGCAOGO
2641 CATOGTCCGQ CATGAAAACA GGTGTCJTGGC GCTOGAAAOA TOTCCCT G CT TCCACCAAGO 
2701 CCAAGACTAC CCCCCAOGAC AAACCGTGAA AAXTGACTGC AACACTTGTG TCTGTCOGGA 
2761 CCGGAACTGG ACCTGCACAO ACCATGTOTG TGATGCCACT TOCTCT G CCA TCGGCATOGC
2821 GCACTACCTC ACCTTCGACG GACTCMGTA CCT G TTCCCT CCWACTGCC AGTATGTTCT
2881 GGTGCAOGAT TACTCCGCCA GTAACCCTGG GACCTTACGG AT CCTGGTGO GGAACGAGOG
2941 GTGCAGCTAC CCCTCAGTGA AATGCAAGAX CCGGGTCACC ATCCTGGTGG AAGGAGOAGA 
3001 CATTGAACTC TTTGATGGGG AGGTGAATCT GAAGAAACCC ATGAAGGATG A£ACTCACT~
.AJCTGC'; *AJTCTGGT- AGTACGTCA1 ~CT^ TTGCTv, ^GCAAGGCA, CTrTGTGG 
^SGACCA:: ■^CrCTGAGCA TTGTCAT ~CTGAAGCGG kCATACCAG^ .hGCAGGTGT' 
FIGURE 1B



3181 TGGCCTGTCT GGGAATTTTG ATGGCATCCA GAACAATGAT TTCACCAGCA GCAGCCTCCA
3241 AATAGAAGAA GACCCTGTGG ACTTTGGGAA TTCCTGGAAA GTGAACCCGC AGTGTGCCGA 
3301 CACCAAGAAA GTACCACTGG ACTCATCCCC TCCCGTCTGC CACAACAACA TCATGAAGCA 
3361 GACGATGGTG GATTCCTCCT GCAGGATCCT CACCAGTGAT ATTTTCCAGG ACTCCAACAG 
3421 GCTGGTGGAC CCTGAGCCAT TCCTGGACAT TTGCATCTAC GACACTTGCT CCTGT G ACTC 
3481 CATTGGGGAC TGCACCTGCT TCTGTGACAC CATTGCTGCT TACGCCCACG TCTGTGCCCA 
3541 GCATGCCAAG GTGGTAGCCT GGAGGACAGC CACATTCTCT CCCCAGAATT GCGAGGAGCG
3601 GAATCTCCAC GAGAATGGGT ATGAGTCTGA GTGGCGCTAT AACAGCTCTG CCCCTCCCTC 
3661 TCCCATCACG TGCCAGCACC CCGAGCCACT GGCATGCCCT GTACAGTGTG TTGAAGGTTG 
3721 CCATGCGCAC TGCCCTCCAG GGAAAATCCT GGATGAGCTT TTGCAGACCT GCATCGACCC 
3781 TGAAGACTGT CCTGTGTGTG AGGTGGCTGG TCGTCGCTTG GCCCCAGGAA AGAAAATCAT
3841 CTTGAACCCC AGTGACCCTG AGCACTGCCA AATTTGTAAT TGTGATGGTG TCAACTTCAC 
3901 CTGTAAGGCC TGCAGAGAAC CCGGAAGTGT TGTGGTGCCC CCCACAGATG GCCCCATTGG
3961 CTCTACCACC TCGTATGTGG AGGACACGTC GGAGCCGCCC CTCCATGACT TCCACTGCAG
4021 CAGGCTTCTG GACCTGGTTT TCCTGCTGGA TGGCTCCTCC AAGCTGTCTG AGGACGAGTT
4081 TGAAGTGCTG AAGGTCTTTG TGGTGGGTAT GATGGAGCAT CTGCACATCT CCCAGAAGCG
4141 GATCCGCGTG GCTGTGGTGG AGTACCACGA CGGCTCCCAC GCCTACATCG AGCTCAAGGA
4201 CCGGAACCGA CCCTCAGAGC TGCGGCGCAT CACCAGCCAG GTGAAGTACG CGGGCAGCGA
4261 GGTGGCCTCC ACCAGTGAGG TCTTAAAGTA CACGCTGTTC CAGATCTTTG GCAAGATCGA
4321 CCGCCCGGAA GCGTCTCGCA TTGCCCTGCT CCTGATGGCC AGCCAGGAGC CCTCAAGGCT
4381 GGCCCGGAAT TTGGTCCGCT ATGTGCAGGG CCTGAAGAAG AAGAAAGTCA TTCTCATCCC
4441 TGTGGGCATC GGGCCCCACG CCAGCCTTAA CCAGATCCAC CTCATAGAGA AGCAGGCCCC
4501 TGAGAACAAG GCCTTTGTGT TCAGTGC7GT GGATCAGTTG GAGCAGCGAA GGGATGAGAT
4561 TATCAACTAC CTCTGTGACC TTCCCCCCGA AGCACCTGCC CCTACTCAGC ACCCCCCAAT
4621 GGCCCAGGTC ACGCTGGCTT CGGAGCTGTT GGGGCTTTCA TCTCCAGOAC CCAAAAGGAA
4681 CTCCATGGTC CTGGATGTGG TGTTTGTCCT GGAAGGGTCA GACAAAATTG GTGAGGCCAA
4741 CTTTAACAAA AGCAGGGACT TCATGGAGGA CCTGATTCAC CCGATGGACG TGGGCCAGGA
4801 CAGGATCCAC GTCACAGTGC TGCAGTACTC GTACATGGTG ACCGTGGAGT ACACCTTCAG
4861 CGAGCCGCAG TCCAAGGGCG AGGTCCTACA CCAGGTGCGC GATATCCGAT ACCGGGCTGG
4921 CAACAGGACC AACACTGGAC TGGCCCTGCA ATACCTGTCC GAACACAGCT TCTCGGTCAG
4981 CCAGGGGGAC CGGGAGCAGG TACCTAACCT GGTCTACATG GTCACAGGAA ACCCCGCTTC 
5041 TGATGAGATC AAGCGGATGC CTGGAGACAT CCAGGTGGTG CCCATCGGGG TGGGTCCACA 
5101 TGCCA ATGTG CAGGAGCTGG AGAAGATTGG CTGCCCCAAT GCCCCCATCC TCATCCATGA 
5161 CTTTGAGATG CTCCCTCGAG AGGCTCCTCA TCTG G TGCTA CAGAGGTGCT GCTCTGGAGA 
5221 CGGGCTGCAG AXCCCCACCC TCTCCCCCAC CCCAGATTGC AGCCAGCCCC TGGATGTGCT 
5281 CCTCCTCCTG GATGCCTCTT CCAGCATTCC AGCTTCTTAC TTTGATGAAA TGAAGAGCTT 
5341 CACCAAGGCT TTTATTTCAA GAGCTAATAT AGGCCCCCGG CTCACTCAAG TGTCGCTGCT 
5401 GCAATATGGA AGCATCACCA CTATCCATGT GCCTTGGAAT GTAGCCTATG AGAAAGTCCA 
5461 TTTACTGAGC CTTCTGGACC TCATGCAGCA GGACGGAGGC CCCACCGAAA TTGGGGATGC
5521 TTTGA OCTTT CCCGTCCGAT ATGTCACCTC AGAAGTCCAT GGTGCCACGC CCGGACCCTC 
5581 GAAAGCGGTG CTTATCCTAG TCACAGATGT CTCCGTGGAT TCAGTGGATG CTGCAQCCGA 
5641 GGCCCCCAGA TCCAACCGAG TGACAGTGTT CCC CA TTGGA ATCGGGGATC CGTACAGTGA 
5701 GaCCCAGCTO AGCAGCTTGO CACGCCCAAA GGCTGGCTCC AATATGGTAA CGCTCCAGCG 
5761 AAT TGAAG AC CTCCCCACCG TGQCCACCCT OOGAAATTCC TTCTTCCACA A GC TCTCCTC 
5821 TQMTTTGAT XBAGTTTOCO TOGATGAGOA TGGGAATGAG AAGACGCCCC GGGATGTCTC
5881 CX 5EI?5^ GACCAGTGCC ACACAGTGAC TTCCCTGQCA GATGGCCAGA CCTTOCTCAA 
5941 GAol CAli^ CTCAACTGTQ ACCOOGOGCC AAGGCCTTCQ TOCCCCAATG CCCACCCCCC 
6001 TCTCACGCT A GAGOAGACCT CTQ C CTGCC U CTOGACCTCT CCCTOTCTQT GCATOGGCAC 
6061 CT CTACCCCG CACATCOTGA CCTTTGATOG GCAGAATTTC AAGCTGACTG CCAGCTGTTC 
6121 GlAi UlCLT A TTTCAAAACA AGGAGCAGGA CCTG G AGGTG ATTCTCCAGA ATGGTGCCTG 
6181 C AGCC CTCGC CCGAAGGAGA CCTGCATGAA ATCCATTCAG GTGAAGCATG ACGGCCTCTC 
6241 AGT TGAGCTC CACAGTGACA TGCAGATGAC AGTGAATGGG AGACTAGTCT CCATCCCATA 
6301 TGTGCGTGGA CACATGGAAG TCAATGTTTA TCGGACCATC ATGTATOAGG TCAGATTCAA 
6361 CCATCTTGGC CACATCTTCA CATTCACCCC CCAAAACAAT GAGTTCCAGC TGCAGCTCAG

FIGURE 1C

6421 CCCCAGGACC TTTCCTTCGA AGACATATGC TCTCTGTGGG ATCTGTGATG AGAACGGAGC
6481 CAATGACTTC ATTCTGAGGG ATGGGACACT CACCACAGAC TGGAAGGCAC TCATCCAGGA
6541 ATGGACCGTA CAGCAGCTTG GGAAGACATC CCAGCCTCTC CATGAGGAGC AGTGTCCTGT
6601 CTCCGAATTC TTCCACTGCC AC G TCCTCCT CTCAGAATTG TTTGCCGAGT GCCACAAGGT
6661 CCTCGCTCCA GCCACCTTTT ATGCCATGTG CCAGCCCGAC ACTTCCCACC CGAAGAAAGT
6721 GTOTGAGGCG ATTGCCTTGT ATGCCCACCT CTGTCGGACC AAAGGGGTCT GTGTGGACTG
6781 GAGGAGGGCC AATTTCTGTG CTATCTCATG TCCACCATCC CTGGTGTACA ACCACTGTGA
6841 CCATGGCTGC CCTCGGCTCT CTGAAGGCAA TACAAGCTCC TGTGGGGACC AACCCTCGGA
6901 AGGCTGCTTC TGCCCCCCAA ACCAAGTCAT GCTGGAAGGT AGCTGTGTCC CCGAGGAGGC
6961 CTGTACCCAG TGCATCAGCG AGGATGGACT CCGGCACCAG TTCCTGGAAA CCTGGGTCCC
7021 AGCCCACCAG CCTTGCCAGA TCTGCACGTG CCTCAGTGGG CGGAAGGTCA ACTCTACGTT
7081 GCAGCCCTGC CCCACAGCCA AAGCTCCCAC CTGTGGCCCG TGTGAAGTGG CCCGCCTCCG
7141 CCAGAACGCA GTGCAGTGCT GCCCGGAGTA CGAGTCTGTG TGTGACCTGG TGAGCTGTGA
7201 CCTGCCCCCG GTGCCTC?CT GCGAAGATGG CCTCCAGATG ACCCTGACCA ATCCTGGCGA
7261 CTGCAGACCC AACTTCACCT GTGCCTGCAG GAAGGATGAA TGCAGACGGG AGTCCCCGCC
7321 CTCTTGTCCC CCGCACCGGA CGCCGGCCCT TCGGAAGACT CAGTGCTGTG ATGAGTATGA
7381 CTGTGCATGC AACTGTGTCA ACTCCACGGT GAGCTGCCCG CTTGGGTACC TGGCCTCGGC
7441 TGTCACCAAC GACTGTGGCT GCACCACAAC AACCTGCTTC CCTGACAAGG TGTGTGTCCA
7501 CCGAGCCACC ATCTACCCTG TGGGCCAGTT CTGGGAflGAG GCCTGTGACG TCTGCACCTG
7561 CACGGACTTG GAGGACTCTG TGATGGGCCT GCGTGTGGCC CAGTGCTCCC AGAAGCCCTG
7621 TGAGGACAAC TGCCTGTCAG GCTTCACTTA TGTCCTTCAT GAAGGCGAGT GCTGTGGAAG
7681 GTGTCTGCCA TCTGCCTGTG AGGTGGTCAC TGGTTCACCA CGGGGCGACG CCCAGTCTCA
7741 CTGGAAGAAT GTTGGCTCTC ACTGGGCCTC CCCTGACAAC CCCTGCCTCA TCAATGAGTG
7801 TGTCCGAGTG AAGGAAGAGG TCTTTGTGCA ACAGAGGAAT GTCTCCTGCC CCCAGC7GAA
7861 TGTCCCCACC TGCCCCACGG GCTTCCAGCT GAGCTGTAAG ACCTCAGAGT GTTGTCCCAC
7921 CTCTCACTGC GAGCCCCTGG AGGCCTGCTT GCTCAATGGT ACCATCATTG CGCCGGGGAA
7981 AAGTCTGATG ATTGATCTGT GTACAACCTG CCGCTGCACC GTGCCGCTGG GACTCATCTC
8041 TGGATTCAAG CTGGAGGGCA GGAAGACCAC CTGTGAGGCA TGCCCCCTGG GTTATAAGGA
8101 AGAGAAGAAC CAAGGTGAAT GCTCTGGGAG ATGTCTGCCT ATAGCTTGCA CCATTCAGCT
8161 AAGAGGAGGA CAGATCATGA CACTGAAGCC TGATGAGACT ATCCAGGATG GCTCTCACAG
8221 TCACTTCTGC AAGGTCAATG AAAGAGGAGA GTACATCTGG GAGAAGAGAG TCACGGGTTG
8281 CCCACCTTTC GATGAACACA AGT C TCT GC C TGAGGGAGGA AAAATCATGA AAATTCCAGG
8341 CACCTGCTGT GACACATGTG AGGAGCCXGA ATGCAAGGAT ATCATTGCCA AGCTGCAGCG
8401 TGTCAAACTG GGAGACTGTA AGTCTGAACA CGAAGTGCAC ATTCATTACT CTGAGGGTAA
8461 ATGTGCCAGC AAAGCCGTGT ACTCCATCCA CATGGAGGAT GTGCAGGACC AGTGCTCCTG
8521 CTGCTCGCCC ACCCAGACGG AGCCCATGCA GGTGGCCCTG CGCTGCACCA ATGGCTCCCT
8581 CATCTACCAT GAGATCCTCA ATGCCATGGA ATGCAGGTGT TCCCCCAGGA AGTGCAGCAA
8641 GTGAGGCCAC TGCCTGGATG CTACTGTCCC CTGCCTTACC CGACCTCACT CGACTGGCCA
8701 GAGTGCTGCT CAGTCCTCCT CAGTCCTCCT CCTGCTCTGC T C T TG I G CTT CCTGATCCCA
8761 CAATAAAGGT CAATCTTTCA CCTTGAAAAA AAAAAAAAAA AA
Human KIPARFAGVUJUJU-ILPGTLCAEGTRGI^STARC^ 60
Dog -S-T-LVR K- -TK- -V- - - -M L-G- - I E D

Human I^CX^raSFSIIGDFQNGKRVSl^VYI^ 120
Dog D--EH-I-L--G D ML--T-SI K

Human ETEAGYYXLSGEAYGFVARIDGSGNFOVLIJ;^ FAJEDDFMTQEGTL 180
Dog -A S K k

Human TSDPYDFANSWAI^SGEQWCERASPPSSSCNXSSGE?^RGX,WEOCO^ 240
Dog - R-K-V P--V--D-V-QV A

Human VD PEP FVALCEKTLCECAGG1X CACP AIXEYARTCAQEGMVTrfYGVTDHS ACS PVCP ACME 300
Dog R T-VQ-M- -P-AV A Q-I V-R-A

Human YRQCVS PCARTCQSUIINEMCOERC\TCCSCPECQLI^EGLrVZSTECPCVHSGKRYPPG 360
Dog -KE T VK-V Q H--G-A--S A-Q

Human TSLSPIX^TCICP^SQWICSKEECPGECLVT^ 420
Dog A- -L0- -K L - v-H Q

Human HSTS IVIETVQCADDRDAVCTRSVTVRLPGUWSLVX^ 480
Dog -T- -V l H N-G--S I-I---Q.--

Human R I QHTVTASVRLS YG ED LOMDWDGRGRLLVKLS PVYAGKTCGLCGKYKGNQGDDFLTPS G 540
Dog M S-V T-Y-A RG R.-.-V--A-

Human l^PRVEDFGJttKTOtfGDCQDUJKQHSDPCA^ 600
Dog L L-A-EN R---S OA- -A L SK--P G

Human PLPYLRNCRYDVCSCSDGRECLCCAIASYAAACAGRGVRVA^ 660
Dog -0--VQ--L D---S-V-N V-R---KI F-A-S--Q

Human CCTPCNtTCRSI^YTDEECNEACIXGCFCPPGL^ 720
Dog M--L E-D---V---S--S L-4--4

Human I FSDKHTKCYCITCFTffiCTMSGVPGSIXPDAVI^SPI^HROT 780
Dog T--GL HP RC

Human UL*TGI^CT)CTCQfroi£CMSH^^ 840
Dog A Q...T O- Q

Human TTOGCNIWCRDRJCKKCTDHV^^ 900
Dog D - -t X- -

Human NPGTniXLVCNl^KPSVXOT 960
Dog E-.-y K q

Human YIIUXCKAI^WVDRKl^ISVVIJOTYO^^ 1020
Dog "^""•"••~-----HR-----T*-tt--*-0---- r c

Human FGKSH1CV«OCADIWWPU)SSPATCIOOTMI(OT^ 1080

FIGURE 2A
Human l^VCIYDTCSCESICDCACFCDTIAAYWT/CAQ 1140
Dog ~-I T A F M H

Human ECEWRYNSCAJ>ACQVTCQHPEPXACPVQCVEGO{AJIC^ 1200
Dog PI I

Human V AC RR FAS GXJTVTLNP SDPEHCQICH CD WKLTCTA CQE P GGLWP PTDAPVS PTTLYVE 1260
Dog L-P---II ---N--G--F--K--R---SV G-IGS--S---

Human DISEPPI^FTCSRI^LVFU^SSRL^EAXFFVXKAFVVDf^ 1320
Dog -T H K D V G H-H RI

Human YHDGSKAYIGLKDRKRPSELRJlIASOVKYAGSQVAiiTSEVLFrifT^ 1380
Dog E T E G

Human ALL1>IASQEPQR^RKFVRYVQGLKKX)CVIVI PVGIGPHANLKQIRXIEKQAPENKAFVL 1440
Dog S-LA--L - s H F

Human SSV^ZLEQQaDZI VSYLCOLAPZAPPPTLPPKKAQ^^ 1500
Dog -G R IN A--0H-P SE SP V

Human FVLEGSDKIGEADFNRSKXF^FyiOFJOVGODSIKVTVM^ 1560
Dog N--K-R R T t

Human ILQRVTlEIRYQGGNRT>rTGLALRYI-SDHS FLVSQGDRTQAPtfLV^TVTGJiPASDEIKRLP 1620
Dog V--Q.-D R- 0 E S- v w .

Human GDIQVVPICVCPNAWOrLERIGVP^PXLXODrETL^ 1680
Dog H K K---M -

Human SPAPDCSOPI^ILLUXSSSSFPASYFDEMKSFAKAFISKANIGPRLTQVSVl^YGSITT 1740
Dog --T V 1 t R

Human IDVPfcWVTEJCAKIXSLVDVMQ^ 1800
Dog AY- -V- - L--Q E S V---V

Human TD VS VDS VDAAADAARS>7R VTVFP I G I GDR YDAAQLRI LAG P AGDSNWKLOXI EDLPTT. 1860
Dog E SE SS KAG--M-R V

Human ^^XGNSFIJDXCSaFVRXCroII>CNITJ^^ 1920
Dog A F --D-V-V ^ s

Human RGUlPSCTKSOSPVT^rrCCCRVrrCPCVrTC 1980
Dog --P G-P-Lfc H

Human EQDIXVXUOIGACSPGARCX3CMKSI 2040
Dog Q KTT DC QM I D-

Human "V** 1 * 1 ** 1 ™^^ 2100
Dog T--Y - R _

Human GTVTTDWKTLVO DTT\^ 2160
Dog - A-T------CT V C W - err-

FIGURE 2B
Human AICOODSCTQEQVCEVIASyAHI/^TT^CVDWRTPDrCAMSCPPSLVYKHCEKGCPRHC 2220
Dog -M--P PKK- --A- K RAN

Human DGNVSSCGDHPSEGCFCPPDJCVMIXGSCVPEEACTOCIGE^^ 2280
Dog E--T q KO S R T A

Human CTCI^GRKVNCTTQPCPTAXAPTCGLCIVARIA^ CDLPPVPHC 2340
Dog L P V L P-

Human ERGI^PTLTNPGECRFHFTCACRKEECIOIVSPPSCPPHRL^ 2400
Dog -D---H - D--R-E T-A

Human STVSCPLCTIASTATyDCG lJll I I CLroKVrVHRSTIYPVCQrWEEGCDVCTCTDKEDAV 2460
Dog AV F- G A L--S-

Human KGIJtVAQCSOKPCEDSCRSGFIVVUlEGECCCRCLPSAC^ 2520
Dog --A--H--N---H

Human WASPENPCLXNECVRVTCEEVFIQQRWSCPOLE^ 2580
Dog D V N--T--T E T-K--PL-

Human ACMUKTIVIGPGTaVTaDVCTTCRCKVOVGVISG^^ 2640
Dog - - L 1 SL T-P G EA X-Q-- -

Human CGRCLPTACTIQLRGGQICTI~KRDETLQDGCDTK7CKV^"^ 2700
Dog 1 3 S I

Human C^AZGCKIKKIPGTCCDTCEEPECKDITARU>yVKVGSCKSEVrVDIHYC^ 2760
Dog K--I-K--R D E E V-

Human SIDIhTOVQDQCSCCSFnOTPMQV;a^Crra^ 2813
Dog - -KME R LI---I---I--R

FIGURE 2C
exon 4 AAATGACAAAAGAGTGAGCCGGTC*
AGGGGGTTTCCAAAATGACAAAAGAGTGAGCCTCT^^ 

GGFQNDKRVSLSVY LGEFFD 

CATTCATTTGTTTGTCAATGGTACCATGCT 

IHLFVNGTMLQGTQR 

GAATGTTCAGGTTAATATGGACCCTGGGGATCACTTTGCAACCCCCT^ 



GAGGGAGCCGGGGOZCAGAGACAGGAAGTAAATGTGCCCAGGGAAAGTGAGTGGCAGGAC 



7GGGTGAAAGCCCCATATCCCGACTCCTGGTCAAGGAGACTTTGCACCAAGGTCCCAGCC 
3'-GGGCTGGCGACCAGTTCCTCTGAA-5'

CTGGAGCATGGGGTTGGGGTTGGAAGGTGGAGGGACATGGAGGAAATGCATGAGAAGCAC 

exon 5 

GCTTCCTGAGCTCCTCCTTGTCCCACCAGCATCTCCATC 

ISMPYASNG 